Save the children


Infants and young people are being traumatized by armed conflict in their countries. Their 
resulting mental illnesses must be addressed, for the good of both the individuals and their society. 


Battles in Ukraine, Gaza and Syria have appalled all who watch them from afar. The effects on the young provoke much of the horror. But many other armed conflicts are occurring, often far less visibly, in developing countries — and these are also home to the world’s highest populations of children and young people.

Under-18s are described as requiring special protection in times of war in the United Nations Convention on the Rights of the Child, which celebrates its 25th anniversary this year. The convention, although lacking the teeth of enforcement, has provided a framework for discussions and planning that has spawned useful research. That research has begun to identify what ‘special protection’ really means — and the amount of time and resources it demands.

For a country to recover from war and rebuild a functional society, its young generation must be physically and mentally fit. In the past decade or so, humanitarian organizations have become increasingly aware of the prevalence of mental illness. This is particularly relevant for children and adolescents, because research has shown beyond doubt that prolonged and severe stress can damage the developing brain. Poor countries, often confronted with life-threatening epidemics of infectious disease, are too often unable to make mental illness a priority. But they surely need to embed in their health-care systems mental-health strategies for helping their traumatized youth.

Researchers, often supported by humanitarian organizations, have already undertaken scores of field studies in countries damaged by war or natural disasters. From Africa to Indonesia to the Balkans, they have tried to work out which interventions could help to mitigate or avert the mental damage caused by severe stress. Common interventions involve structured individual or group psychotherapy based in schools, for example, or family counselling.

As one might expect, the quality of societal support — an intact 
family, a trusted care-giver, a protective neighbourhood — has a major 
impact on whether an intervention will help. Still, many children 
emerge from trauma undamaged, even without an intervention. And 
an approach that works well in one context may even be harmful in 
another; for example, some displaced boys in Burundi responded neg- 
atively to a type of psychotherapy that had proved helpful in Indonesia. 

There can be no single approach to limiting the mental damage 
inflicted by war. To be useful, interventions require intense engage- 
ment in the life and experience of each individual. For example, when 
working in Bosnia in the 1990s, a US psychiatrist discovered from con- 
versations with one boy in his study group that, to get to school, the 
boy had to pass the tree from which he had witnessed his brother being 
hanged. It was helpful to bring this nightmare confrontation into their 
therapeutic sessions. 

Worryingly, new scientific results are not getting through. Many 
popular therapeutic approaches — family counselling, for one — have 
not been rigorously tested in post-conflict contexts. And psycho- 
therapy, known to be effective in post-traumatic stress disorder, is rarely 
practised, in part because of a lack of capacity to deliver it. 

Humanitarian organizations, for all their importance, might not leave 
conflict zones with sound infrastructure. This underlines again the need 
for countries to develop their own scientific and medical capacities. 

Immediate interventions in schools make sense, because rebuild- 
ing a society requires an educated next generation. But many more 
longitudinal studies are also needed to track traumatized children into 
adulthood, to see if and how the treatment they received helped them.


Future computing 


Pushing the boundaries of current computing 
technologies will show the way to new ones. 


What emerging technologies promise to displace conventional silicon chips? Future computers could run on graphene, perhaps, or the hidden powers of quantum physics or brain-like synaptic networks. Research on all these options and more is under way as it becomes clear that enhancement of silicon-chip technology is hitting serious practical obstacles: in manufacturing, connectivity and heat generation.

No emerging technology is likely to be a get-out-of-jail-free card. Amazing performance in one area is often accompanied by serious limits in another. Computing based on carbon nanotubes or graphene, for example, presents formidable challenges in reliable fabrication.

On page 147, information technologist Igor Markov argues that we 
should focus on the fundamental limits in computing, and use those 
to evaluate future possibilities. This approach has a rich history. Work- 
ing out the maximum efficiency of steam engines, nineteenth-century 
physicists discovered thermodynamics. Modern information science 
was born in 1948 when Claude Shannon at Bell Labs considered what 
an ideal communication channel would look like. 

Computations have limits: they take up space, time and energy. In 
2000, IT researcher Seth Lloyd calculated the computing power of the 
ultimate laptop, which, by miraculous engineering advances, could 
harness all its energy for information processing (S. Lloyd Nature 406, 
1047–1054; 2000). This ultimate machine could perform 10⁵¹ operations
per second, 40 orders of magnitude more than computers today.
That represents 250 years of progress at current rates of improvement. 
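
The link between the two figures quoted for the ultimate laptop (a gap of 40 orders of magnitude, and 250 years of progress) can be checked with a few lines of arithmetic. The sketch below assumes that "current rates of improvement" means steady exponential growth with a fixed doubling time; that assumption, and the function and constant names, are illustrative and are not taken from Lloyd's paper.

```python
import math

# Figures quoted in the text above.
ORDERS_OF_MAGNITUDE = 40   # gap between today's computers and the ultimate laptop
YEARS_OF_PROGRESS = 250    # time said to be needed at current rates


def implied_doubling_time(orders_of_magnitude: float, years: float) -> float:
    """Return the doubling time (in years) that closes the gap in the given time.

    Assumes performance grows exponentially at a constant rate (an
    illustrative assumption, not a claim made in the source).
    """
    # A factor of 10**n corresponds to n * log2(10) doublings.
    doublings = orders_of_magnitude * math.log2(10)
    return years / doublings


doubling_years = implied_doubling_time(ORDERS_OF_MAGNITUDE, YEARS_OF_PROGRESS)
print(f"Implied doubling time: {doubling_years:.2f} years")
```

Under that assumption, the two numbers imply a doubling of performance a little under every two years, which is in line with the pace usually associated with silicon-chip improvement.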

Markov’s message is not to be overly optimistic or pessimistic about 
further progress. We should focus on the boundaries and push to see 
where they yield.

China should aim for a total cap on emissions

A focus on carbon intensity alone will allow emissions to grow with the economy, argues Qiang Wang.

The big players in climate-change negotiations are starting to position themselves ahead of talks concerning a new global treaty in Paris next year.

Until now, the global deadlock on efforts to curb greenhouse-gas emissions has centred around the unwillingness of the United States to commit to a binding reduction target. This was shown most vividly by the nation’s rejection of the 1997 Kyoto Protocol.

Many countries, China included, had little incentive to introduce policies to control carbon dioxide while the United States was not doing so. In June, the United States signalled a shift from that position when its Environmental Protection Agency (EPA) unveiled a new climate plan.

Using its authority under the Clean Air Act in lieu of congressional action, the EPA set a target to cut carbon pollution from power plants — the largest source of total US emissions — by 30% below 2005 levels by 2030.

Is the move a climate game changer? I believe that China will make some effort to react to the US plan. Exactly how is still unclear, but here is a suggestion: China, the world’s biggest greenhouse-gas emitter, should upgrade its climate policy from reducing carbon intensity to setting a long-term cap on total emissions.

The difference is important. Carbon intensity is measured relative to gross domestic product, so while the economy is growing, so too can pollution. An absolute cap attempts to break that link: economic growth must not drive up carbon emissions.

In June, China’s long-standing chief climate negotiator, Xie Zhenhua, gave the strongest signal yet that the country was considering such a switch. He told reporters at a meeting in Berlin that China was approaching a “peaking year” for its carbon emissions in the build-up to the Paris talks.

To agree on an emissions cap, China must be convinced that the link between economic growth and emissions can be broken. Here, there is another strong positive message from the United States. Nine states in the northeast of the country have started a cap-and-trade programme known as the Regional Greenhouse Gas Initiative, in which the government places a ceiling on carbon emissions and allows companies to buy and sell permits for those emissions. Since 2009, the states involved in the programme have cut their emissions by 18% on average, while their economies have grown by 9.2%. By comparison, emissions in the other 41 states fell by 4%, and their economies grew by 8.8%. Thus, the real chal-


could find it harder to continue to cut carbon intensity. With domestic 
coal demand in the United States expected to fall by 30% owing to the 
EPA rule, US coal firms — sitting on the largest recoverable reserves 
in the world — are pushing to increase exports to Asia, especially 
to China. Three new coal-export ports are being proposed for the 
Pacific coast, and are projected to ship up to 100 million tonnes of 
coal per year. The huge added supply to Asia will lead to cheaper coal 
and increased consumption. The European Union (EU) is a good 
example. Coal consumption has risen in the EU in recent years, and 
use of comparatively clean gas has fallen. This is partly because US 
coal exports to the EU sharply increased from 14 million tonnes in 
2003 to 47 million tonnes in 2013. 

It is unrealistic for China to switch immediately from cutting 
carbon intensity to a cap on emissions. A more 
rational and practical strategy is to make the 
transition in two steps. 

First, China needs to obtain better data. 
Researchers must work out when Chinese 
emissions are likely to peak, assuming that the 
economy continues to grow as expected. This 
will provide a reliable baseline for any reduc- 
tion target. It will require international scientific 
cooperation, because modelling for China must 
be informed by research results about the tra- 
jectory of emissions patterns in the EU, United 
States and other developed regions. 

The peaking year is a complex issue and 
Chinese scientists and scholars differ greatly in 
their opinions of it. But the widely accepted view 
is that China’s carbon output under the business- 
as-usual scenario will peak sometime after 2030. 

Second, China needs to prioritize the use of ‘bridging’ fuel. It is no 
coincidence that the nine US states participating in the regional scheme 
have more nuclear energy and shale gas in their portfolios than most. 

In 2011, nuclear energy accounted for less than 2% of China’s electric- 
ity, but 12% of electricity globally and 21% in member countries of the 
Organisation for Economic Co-operation and Development. China's 
technically recoverable shale-gas resources are 31.6 trillion cubic metres, 
nearly double the United States’ 18.8 trillion cubic metres. I advocate 
nuclear energy and shale gas as bridging fuels to a carbon-free future, if 
China can handle the safety and environmental concerns. 

An absolute cap on China's emissions is in sight. But it will take 
political courage and practical changes to make it a reality.


Qiang Wang is conjoint professor at the Xinjiang Institute of Ecology 
and Geography of the Chinese Academy of Sciences in Urumqi. 
e-mail: qiangwang7@gmail.com 


The views expressed in this article are those of the author alone. 
RESEARCH HIGHLIGHTS

Selections from the scientific literature


GENE EDITING 


CRISPR corrects
β-thalassaemia


A common genetic blood 
disorder has been corrected in 
cultured stem cells by using a 
cutting-edge genome-editing 
technique. 

The disorder β-thalassaemia
is characterized by reduced 
levels of haemoglobin due 
to mutations in the gene for 
β-globin (HBB). Yuet Kan
and his colleagues at the 
University of California, San 
Francisco, created induced 
pluripotent stem cells using 
skin fibroblasts from a person 
with β-thalassaemia. They
then used the CRISPR-Cas9 
gene-editing technique 
to correct the unwanted 
mutation precisely, without 
affecting other genes. After 
differentiation in culture into 
precursors of red blood cells, 
the modified cells showed 
higher expression of HBB than 
unmodified cells. 

Transplantation of such 
corrected cells back into 
the original patient could 
one day provide a cure for 
β-thalassaemia, say the
authors. 

Genome Res. http://doi.org/t3v 
(2014) 


Another super- 
Earth found 


A ‘super-Earth’ planet — an 
extrasolar planet larger 

than Earth but smaller than 
Neptune — has been detected 
in the habitable zone of a star 
called Gliese 832. 

Robert Wittenmyer at the 
University of New South 
Wales in Sydney, Australia, 
and his colleagues used data 
from various telescopes to 
detect a planet with a mass 
of 5.4 Earths in orbit around 
this star. Although the planet
is in the habitable zone — the
region around a star in which
it is thought that life could
potentially exist — its large
size suggests that it may have
a thick atmosphere. This
might make it more like a
‘super-Venus’, with a dense
atmosphere leading to high
surface temperatures that
would render it inhospitable.

Despite this, the presence
of this potentially rocky inner
planet, as well as a previously
discovered outer giant planet,
makes the Gliese 832 system
a rare miniature version of
our Solar System, the authors
suggest.
Astrophys. J. 791, 114 (2014)


POLAR SCIENCE


Arctic snowpack thins

As Arctic sea ice has shrunk and thinned, so has the snowpack blanketing it.

Melinda Webster at the University of Washington in Seattle and her colleagues studied data on spring snow depth gathered between 2009 and 2013 by radar surveys conducted from the air and verified with surface measurements (pictured). They compared these to information collected between 1954 and 1991 by Soviet ice stations. The error bars are large, but between the older and the current surveys, snow thickness had decreased by some 37% in the western Arctic and by 56% in the Beaufort and Chukchi seas.

As sea ice starts forming later each autumn, there is less time for snow to accumulate before winter sets in, the authors say.

J. Geophys. Res. Oceans http://doi.org/t3q (2014)


Cleaner, greener
ammonia 


A method of producing 
ammonia could yield a greener 
route to nitrogen-based 
fertilizers. 

Ammonia is currently 
synthesized by combining 
nitrogen and hydrogen 
under high pressures and 
temperatures in a reaction 
called the Haber-Bosch 
process. Making the hydrogen 
consumes around 5% of 
the world’s natural-gas 
production, and releases large 
amounts of carbon dioxide. 

Stuart Licht at George 
Washington University in 
Washington DC and his 
colleagues applied a voltage to 
steam and air (the source of 
nitrogen) bubbling through 
molten hydroxide containing 
catalytic nanoparticles of iron 
oxide. This produced ammonia 
from nitrogen and water 
directly by electrolysis. The 
nanoparticles clump together 
over time, slowing the reaction, 
and moderate temperatures 
and pressures are still needed. 
However, if the process can 
be scaled up, it could be less 
energy-intensive than the 
current industrial method. 
Science 345, 637-640 (2014) 


Resistance genes 
mapped 


Researchers have pinpointed 
mutations encoding antibiotic 
resistance in bacteria that 
cause pneumonia, borrowing 
a technique more often used to 
hunt for gene variations linked 
to common human diseases. 
Streptococcus pneumoniae 
is a leading killer of children 
under five worldwide. The 
bacterium is prone to develop 
antibiotic resistance, but 
pinning down the mutations 
responsible has proved
difficult. 

A team led by Stephen 
Bentley and Julian Parkhill, 
at the Wellcome Trust Sanger 
Institute in Hinxton, UK, 
analysed the genomes of 
3,701 samples of S. pneumoniae 
collected from carriers in a
refugee camp in Thailand and 
from patients in Massachusetts 
clinics. 

The authors searched for 
regions of the genome that 
differed between bacteria 
resistant to β-lactam antibiotics
(such as penicillin) and those 
still susceptible to them. They 
found 301 DNA variations 
in 51 regions linked to drug 
resistance, including novel 
genes as well as those involved 
in building the cell wall, the 
target of the β-lactams.

PLoS Genet. 10, e1004547 (2014)


IMAGING 
Seeing through a
mouse skull 


Glowing nanotubes have 
allowed researchers to peer 
through a mouse’s skull and
examine its living brain in real 
time. 

Calvin Kuo and Hongjie 
Dai of Stanford University in 
California and their colleagues 
injected fluorescent molecules 
based on carbon nanotubes 
into the tails of mice. The 
nanotubes were then carried 
around in the animals’ 
bloodstreams and when lasers 
were shone onto the rodents’ 
skulls, the molecules gave off 
near-infrared light (pictured) 
that was visible through 
the bone. This allowed the 
researchers to image blood 
moving through the brain 
to a depth of more than
2 millimetres and to detect 


obstructed arteries. However, 
the method might not be 
usable in humans because of 
our thicker skulls. 

Nature Photon. http://doi.org/t2z (2014)


SEISMOLOGY 


From earthquakes 
to icequakes 


Big earthquakes on land can 
trigger small distant ‘icequakes’ 
in the Antarctic ice sheet. 

At magnitude 8.8, the 2010 
Maule earthquake in Chile 
was the largest quake in the 
Southern Hemisphere for 
half a century. Zhigang Peng
at the Georgia Institute of 
Technology in Atlanta and his 
colleagues hunted for traces 
of it at seismic stations across 
Antarctica. 

They discovered 
high-frequency shaking 
representing small icequakes, 
with waves of tremors 
appearing in the kilometre- 
thick ice sheet that covers 
the frozen continent. These 
seemed to be triggered by 
the lower-frequency rumble 
stemming from the Chilean 
event, and represent the first 
evidence of links between 
quakes in the solid earth and 
in the cryosphere. 

Nature Geosci. http://dx.doi.org/10.1038/ngeo02212 (2014)


STEM CELLS 


Fresh growth 
from elderly cells 


Human skin cells can be 
reprogrammed into neural 
cells that form synapses with 
neurons in severed spinal cords 
in rats. 

A team led by Paul Lu 
and Mark Tuszynski at the 
University of California San 
Diego in La Jolla took skin 
fibroblasts from an 86-year-old 
man, converted them in culture 
into induced pluripotent stem 
cells (iPS cells) and then into 
neural stem cells, and grafted 
these cells into two-week-old 
immunodeficient rats whose 
spinal cords were damaged at 
the neck. Three months later, 
the stem cells had grown into
neurons that projected axons
along the whole length of the
rat spinal cord, even extending
into the brain. Unlike similar
experiments with neurons
derived from embryonic stem
cells, these iPS-cell-derived
neurons did not restore
movement in the rats’ limbs,
perhaps as a result of scar tissue
that formed at the injury site.
Neuron http://doi.org/t36 (2014)


SOCIAL SELECTION


Clash over the Kardashians of science

Here’s a novel approach for getting an article noticed: put ‘Kardashian’ in the title. A paper that compared Twitter-using researchers to the celebrity Kim Kardashian incited a backlash on social media.

Neil Hall, a genomics researcher at the University of Liverpool, UK, introduced a metric called the Kardashian Index, or K value. This is calculated by dividing a researcher's number of Twitter followers by the number of scientific citations he or she has. The K value supposedly identifies scientists whose visibility exceeds their contributions — somewhat like a certain socialite, Hall suggests. The article was intended as satire, but not everyone was amused. “This paper suggests only highly cited scientists deserve a large Twitter following, & everyone else should shut up,” tweeted Katie Mack, an astrophysicist at the University of Melbourne in Australia.

Genome Biol. 15, 424 (2014)

Based on data from altmetric.com. Altmetric is supported by Macmillan Science and Education, which owns Nature Publishing Group.

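The K value, as this item describes it, is a simple ratio, and a minimal sketch makes the calculation concrete. The code follows only the description given here (followers divided by citations); the published paper may normalize the figure differently. The function name and the example numbers are hypothetical.

```python
def k_value(twitter_followers: int, citations: int) -> float:
    """Kardashian Index as summarized in this item: followers per citation.

    This follows the simplified description in the text; it is not
    guaranteed to match the exact formula in the original paper.
    """
    if citations <= 0:
        # The ratio is undefined for a researcher with no citations.
        raise ValueError("citations must be a positive number")
    return twitter_followers / citations


# Hypothetical researcher: 5,000 followers and 100 citations.
example = k_value(5000, 100)
print(example)
```

A higher value would, on this reading, flag a researcher whose online visibility is large relative to their citation count.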

MICROBIOLOGY 


Ecosystems afloat 
in asphalt 


Water droplets suspended in 
the world’s largest tar ‘lake’ 
are teeming with diverse 
ecosystems of bacteria 
and methane-producing 
microorganisms, despite the 
inhospitable living conditions. 
Droplets just a few 
microlitres in volume that 
were isolated from Pitch Lake 
(pictured), a huge tar pit on 
the island of Trinidad, contain 
a menagerie of bacteria 
and archaea, report Rainer 
Meckenstock at the Helmholtz 
Zentrum in Munich, 
Germany, and his colleagues. 


NATURE.COM 
For more on 

popular papers: 
go.nature.com/hgeqwh 


They used DNA sequencing 
to reveal that multiple species 
work together to break down 
the oil surrounding the water 
droplets, which are thought to 
originate deep underground. 
These microhabitats could 
be an unrecognized factor in 
the biodegradation of large 
volumes of oil, the authors 
suggest. 
Science 345, 673-676 (2014) 
For a longer story on this research, 
see go.nature.com/odleal 
SEVEN DAYS


POLICY 


US–Africa summit
US President Barack Obama
backed the idea of a Global
Alliance for Climate-Smart
Agriculture at the US–Africa
Leaders Summit in Washington 
DC on 4-6 August. The 
partnership would bring 
together governments, industry 
and non-governmental 
organizations to help boost 
African agriculture while 
keeping farming-related 
greenhouse-gas emissions in 
check. The alliance is slated 

for launch on 23 September. 
The summit also saw Sweden 
pledge US$1 billion to Obama's 
Power Africa initiative to 
double people's access to 
electricity in sub-Saharan 
Africa. 


Emissions lawsuit 


A coalition of environmental 
groups is mounting a legal 
challenge to force the US 
Environmental Protection 
Agency (EPA) to regulate 
greenhouse-gas emissions 
from aircraft. On 5 August, the 
groups, led by the Center for 
Biological Diversity in Tucson, 
Arizona, filed a formal notice 
of intent to sue. They argue 
that the EPA should impose 
limits on aviation emissions 
under the Clean Air Act — 

the law that regulates carbon 
dioxide produced by power 
plants and vehicles. 


Licence battle 


A coalition of more than 

50 research institutions, 
funders and open-access 
publishers signed a letter 
dated 7 August protesting 
against a new set of licences 
governing open-access articles 
(see go.nature.com/agficr). 
The licences will limit the 
legal reuse of research articles 
and data that are supposed 

to be freely available to the 
public, the coalition argues. 
The Association of Scientific,
Technical and Medical
Publishers, a trade group
headquartered in Oxford, UK,
drew up the disputed licences.
The letter calls for the Creative
Commons licences to be used
as the global standard for open
research output.


Rosetta’s rendezvous

The European Space Agency’s comet-chasing spacecraft Rosetta arrived at its destination on 6 August after a ten-year journey. Performing the last of a set of ten manoeuvres, Rosetta entered the same orbit around the Sun as its target, 67P/Churyumov–Gerasimenko, to become the first spacecraft to rendezvous with a comet. The probe will study the body before attempting to place a lander, Philae, on its surface in November. Rosetta will continue to follow and measure the comet as it swings around the Sun in August 2015. See go.nature.com/oqzeaa for more.


Timber law 


Illegal wood products are still 
entering European markets 
one year after the start of a
law to prevent trade in illicit 
timber. A survey conducted 
by the WWF, the international
environmental group, found 
that only 11 European Union 
countries have adopted 
national legislation and 
robust penalties. The worst 
culprits include Hungary 
and Spain, said the WWF on

6 August. The survey echoes an 
assessment from the European 
Commission that found similar 
failings. Illegal logging is a 
leading cause of deforestation 
in tropical forests. 


GMO green light 

The US Department of 
Agriculture says that new 
varieties of genetically 
engineered maize (corn) and 
soya beans will not become 
plant ‘pests’ to other crops. The 
agency's final environmental 
assessment, published on 

6 August, paves the way for 
approval of the first plants 
engineered to be resistant to the 
herbicide 2,4-D. Cotton and 
soya-bean plants engineered 
to resist the herbicide dicamba
also passed the assessment. 
The US Environmental 
Protection Agency is still 
reviewing the herbicides to be 
used on the crops. 


EVENTS 


Ballute flight 


A balloon-parachute hybrid 
designed by NASA to slow 
down spacecraft entering 

the thin atmosphere of 

Mars is ready for use, the 
agency announced on 

8 August. The ‘ballute’ was 
one of two re-entry devices 
tested on a 28 June flight 

over the Pacific Ocean 

near Hawaii. The second, a 
supersonic parachute, tore on 
deployment. NASA intends 

to test a redesigned version on 
two more test flights next year. 


Troublesome book 
More than 130 leading 
population geneticists have 
condemned a book that argues 
that genetic variation between 
human populations could 
underlie global economic, 
political and social differences. 
The book, A Troublesome 
Inheritance (Penguin, 2014), 
by science journalist Nicholas 
Wade, uses “incomplete and 
inaccurate explanations” of 
research to support arguments 
about differences among 
human societies, the geneticists 
say in a 10 August letter to The
New York Times. Wade's book 
was published in May. See 
go.nature.com/ktvblx for more. 


Ebola emergency 
The World Health 
Organization (WHO) 
declared the West African 
Ebola outbreak a public-health 
emergency of international 
concern on 8 August, just 
before deaths reached more 
than 1,000. The outbreak is 
still concentrated in Sierra 
Leone, Guinea and Liberia, 
but Nigeria is also reporting 
cases. On 11 August, the
WHO convened a meeting of 
experts to discuss the ethics of 
using experimental medicines 
that have not yet been tested 
in humans. In a statement, 

the panel concluded that “it 

is ethical to offer unproven 
interventions with as yet 
unknown efficacy and adverse 
events, as potential treatment 
or prevention”. Two Americans 
have already received an 
experimental antibody, made 
by Mapp Biopharmaceutical 
of San Diego, California, and 
further doses of the scarce 
drug are to be shipped to 

West Africa. See go.nature.com/9ué6sic for more.


FACILITIES 


Polar power cut 
Research has stalled at the 
British Antarctic Survey's 
Halley Research Station
(pictured) in Antarctica
after a power failure, the
organization said in a
statement on 6 August. Six
days later, the station's staff
reported that a coolant
leak from a main pipe had
occurred on 30 July, leading
to generators overheating
and shutting down. Some
power and heating has been
restored, but “all science,
apart from meteorological
observations essential for
weather forecasting, has been
stopped”. Disrupted work
includes ozone monitoring,
meteorology for climate
science, and studies of the
upper atmosphere used for
forecasting space weather. See
go.nature.com/cjtrpt for more.


TREND WATCH

An online survey of 1,568 Chinese consumers suggests that demand for shark-fin soup is falling, according to a 4 August report by WildAid, a non-governmental organization in San Francisco, California. The report also polled shark-fin vendors in Guangzhou, China, who reported declined sales and reduced prices, and fishermen in Indonesia who said that prices had dropped. WildAid says that awareness campaigns and the desire to protect sharks are major factors in changing attitudes.


Suez upgrade 

Egypt has announced a 
US$4-billion construction 
project to add an extra channel 


to the Suez canal to allow 
more ships to pass along this 
vital trade route from the Red 
Sea to the Mediterranean. 

The 5-year project involves 
digging or dredging along 

72 kilometres of the canal’s 
163-kilometre length, say 
officials at the Suez Canal 
Authority. On 5 August, 
Egypt's president Abdel Fattah 
el-Sisi said that he hopes to see 
the new waterway opened
one year from now.


PEOPLE 


Biologist dies 

J. Woodland Hastings, who 
helped to found the study 

of circadian rhythms, died 

on 6 August at his home in 
Lexington, Massachusetts. 
He was 87. Hastings studied 
bacterial bioluminescence, 
including its day-night 
rhythms, at Harvard 
University in Cambridge, 
Massachusetts. His research 
also provided early evidence of 
bacterial communication and 
quorum sensing — a system 
that lets species detect and 
respond to stimuli according 
to their population density. 


Scripps leader 


The Scripps Research Institute 
in La Jolla, California, 
announced on 11 August that 
it has appointed one of its 
molecular biologists, James 
Paulson, as acting president. 
The institute — which has
a US$21-million budget
deficit — is still looking for
a long-term leader after
the resignation of Michael
Marletta in July. Marletta
had stepped down after
his plan for a $600-million
merger between Scripps and
the University of Southern
California in Los Angeles
triggered a faculty revolt (see
go.nature.com/cvozom). He
will remain a staff member at
Scripps.


LOSS OF APPETITE FOR SHARK-FIN SOUP

A survey finds that Chinese consumer demand for the delicacy is falling (as are traders’ prices).


16–21 AUGUST

The World Weather Open Science Conference in Montreal, Canada, discusses how to improve seasonal predictions. Topics also include the dynamics and predictability of weather systems such as clouds and tropical cyclones.
go.nature.com/7uskg7


Corrupt ivory 

All legal sales of ivory should
be stopped for at least ten years 
because corruption is ruining 
attempts to save African 
elephants, according to a paper 
published on 7 August by the 
Wildlife Conservation Society 
(see E. L. Bennett Conserv. 
Biol. http://doi.org/t5v; 2014). 
The society, based in New 
York, finds that corruption 
among government officials 
in charge of legal ivory 
markets is aggravating 
conservation problems. It 
points out that six of the 

eight countries identified 

as the worst offenders in 

ivory trafficking are in the 
bottom half of a league table 
of honest governance and 
public services drawn up by 
Transparency International 

in Berlin. See go.nature.com/x8jvew for more.
Unprecedented drought in California has substantially degraded aquatic habitats.


Native ecosystems 
blitzed by drought 


California’s current water crisis offers a preview of what 


climate change will bring. 


BY ALEXANDRA WITZE 


Peter Moyle has seen a lot in five decades of roaming California’s streams and rivers and gathering data on the fish that live in them. But last month he saw something new: tributaries of the Navarro River, which rises in vineyards before snaking through a redwood forest to the Pacific, had dried up completely. “They looked in July like they normally look in September or October, at the end of the dry season,” says Moyle, a fish biologist at the University of California, Davis.


Blame the drought. The Navarro and its 
hard-pressed inhabitants are just one example 
of stresses facing a parched state. From the tow- 
ering Sierra Nevada mountains — where the 
snowpack this May was only 18% of the average 
— to the broad Sacramento-San Joaquin river 
delta, the record-setting drought is reshaping 
California's ecosystems. 

It is also giving researchers a glimpse of the 
future. California has always had an extreme 
hydrological cycle, with parching droughts 
interrupted by drenching Pacific storms (see 
‘Extreme hydrology’). But scientists say that the
current drought — now in its third year — holds
lessons for what to expect 50 years from now. 

“The west has always gone through this, but 
we'll be going through it at perhaps a more 
rapid cycle,” says Mark Schwartz, a plant ecologist
and director of the John Muir Institute of
the Environment at the University of Califor- 
nia, Davis. He and others are discussing the 
drought’s ecological consequences at the annual 
meeting of the Ecological Society of America, 
which runs from 10 to 15 August in Sacra- 
mento, California. He says that the state’s plant 
and animal species are at risk in part because 
California ecosystems are already highly modi- 
fied and vulnerable to a variety of stresses. 

Many of the state’s 129 species of native 
inland fish, including several types of salmon, 
are listed by federal or state agencies under 
various levels of endangerment. “We're starting 
from a pretty low spot,” says Moyle. He hopes
to use the current drought to explore where 
native fish have the best chances of surviving. 

That could be in dammed streams such as 
Putah Creek near the Davis campus, where 
water flow can be controlled to optimize 
native fish survival. Another focus might be 
on spring-fed streams such as those that flow 
down from volcanic terrain in northernmost 
California and can survive drought much 
longer than snow-fed streams. 

In the late 1970s, Moyle discovered that native 
fish in the Monterey Bay watershed recolonized 
their streams relatively quickly after a two-year 
drought. But today’s streams face greater eco- 
logical pressures, such as more dams and more 
non-native species competing for habitat. 


SPACE INVADERS 

Other challenges arise in the delta where the 
Sacramento and San Joaquin rivers meet, north- 
east of San Francisco. An invasive saltwater clam 
(Potamocorbula amurensis) has taken advantage 
of warming river waters and moved several kilo- 
metres upriver, says Janet Thompson, an aquatic 
ecologist with the US Geological Survey (USGS) 
in Menlo Park, California. 

Potamocorbula out-competes a freshwater 
clam (Corbicula fluminea), and accumulates 
about four times as much of the element sele- 
nium from agricultural run-off and refineries 
as its freshwater cousin does. When endan- 
gered sturgeon feed on Potamocorbula, the fish 
consume much more selenium than is optimal. 
“That's the biggest shift that we've seen that’s of 
environmental concern,” says Thompson. 
EXTREME HYDROLOGY


The annual snowmelt and rainfall that feeds 
California's streams and rivers is highly variable. 
“These are the kinds of things that can have
a lasting effect on a predator species.” 

Teasing out the drought’s effects on ter- 
restrial animals is tougher. Researchers have 
documented drops in various California bird 
populations this year, such as mallard ducks 
(Anas platyrhynchos) and tricolor blackbirds
(Agelaius tricolor). But many other factors — 
especially habitat loss — also come into play, so 
it becomes hard to isolate the effects of drought. 

The drought’s effects on larger animals such 
as bears are also uncertain. Anecdotal reports 
suggest that more bears than usual are showing 
up closer to people this year, says Jason Holley, 
a wildlife biologist at the California Department 
of Fish and Wildlife in Rancho Cordova. Within 
the space of six weeks this spring, four black 
bears appeared along the Sacramento River cor- 
ridor, much farther out of the mountains than 
normal. “Those sorts of calls definitely pique 
your interest,” says Holley, who thinks that dry 
conditions in the mountains might be pushing 
bears closer to populated areas. 

The longest-lasting effect could be on 
California's forests, including its iconic giant 
sequoias. The drought has handed forest ecol- 
ogists an unplanned experiment, says Phillip 
van Mantgem, a forestry expert at the USGS 
in Arcata, California, who is speaking at the 
Sacramento meeting. 

Researchers are gathering data to examine 
whether thinning of plots in the forest, in part 
to reduce fire risk, might help trees do better 
under drought. Tests may also help to reveal 
the main mechanisms by which drought kills 
different tree species, whether by interrupting 
the flow of water within the tree or by starving 
it. “I’m really curious to see how this turns out,”
van Mantgem says. 

There should be plenty of time to gather 
data. Climatologists expect an El Niño weather
pattern to form in the Pacific this year, which
usually brings more rain and snow to parts of
California (see Nature 508, 20-21; 2014). But
the pending El Niño looks to be weaker than
first expected, and may not have much, if any,
influence on ending the drought. Chances are
that the state will remain dry well into 2015.
Marine communities living near mining targets such as hydrothermal vent fields might be at risk.


Health check for 
deep-sea mining 


European project evaluates risks to delicate ecosystems. 


BY KATIA MOSKVITCH 


As commercial plans to exploit mineral
resources on deep-ocean beds gather
pace, marine researchers are increasingly
concerned about the damage such
projects might cause to the sensitive and
little-understood ecosystems that thrive there.
Now, scientists are taking to the sea as part of a
three-year, €12-million (US$16-million) pro- 
ject designed to address these concerns and to 
develop a set of guidelines for industry. 

The latest research expedition of the 
Managing Impacts of Deep-sea Resource 
Exploitation (MIDAS) programme returned 
to France earlier this month after exploring the 
Lucky Strike region of the Mid-Atlantic Ridge 
near the Azores islands. There, a research team 
began investigating whether plumes of parti- 
cles that might arise from future mining oper- 
ations near hot hydrothermal vents — often 
rich sources of metals — could affect the crea- 
tures that live there, such as deep-sea mussels. 

“The goal of our experiment is to test the 
effects of sulphide particle deposits on the 
structure — composition, density, biomass, 
diversity — of the dominant hydrothermal 
fauna of the Lucky Strike vent field,” says 
Jozée Sarrazin, a deep-sea ecologist at the
French Research Institute for Exploitation 
of the Sea (IFREMER) in Plouzané, France, 
who is leading the expedition. “It should help 
us to propose management strategies to 
help protect the unique fauna associated with 
high-temperature emissions on the sea floor.”

Resources such as polymetallic sulphides, 
manganese nodules, cobalt-rich ferromanga- 
nese crusts, methane hydrates and rare-earth 
elements exist in large quantities around deep- 
sea hydrothermal vents, having escaped from 
the molten crust below. The idea of mining 
them was first mooted in the 1960s, but only 
now, with land sources declining and demand 
rising, is it being seriously explored. 

Although no mining projects are yet under 
way, Nautilus Minerals of Toronto, Canada, 
has received a green light from the govern- 
ment of Papua New Guinea to mine about 
50 kilometres offshore in the Bismarck Sea, at 
a depth of 1.6 kilometres. Other concessions 
have been awarded in the eastern Pacific 
Ocean. Nautilus would use sea-floor trawlers
to cut or scoop up the deposits, which would
then be pumped up to a support ship.

The effects of such mining are cause for con- 
cern. The operations may “severely damage” 
the sensitive biological communities that live
near under-sea mountains, hydrothermal 
vents and mineral-rich nodules on the sea 
floor, says David Santillo, a marine biologist 
and senior scientist at Greenpeace Research 
Laboratories at the University of Exeter, 
UK. As well as the physical destruction of 
habitats, he adds, this type of mining could 
smother deep-sea species with suspended 
plumes of sediment. Species could also be 
disturbed by noise, light pollution and expo- 
sure to toxic metals and other chemicals 
released by the mining. 

The severity of such effects depends 
on several factors, including the nature of 
the exploited resource and the method of 
extraction, says oceanographer Cindy Van 
Dover, director of the Duke University 
Marine Laboratory in Beaufort, North Car- 
olina. But her biggest concern is the general 
lack of knowledge about sea-floor processes 
and the cumulative effects of multiple 
mining events. “If we get the environmental 
management wrong, we are unlikely to be 
able to fix our mistake,” she says. 

The MIDAS project, which began in 
November, is receiving €9 million from the 
European Union, and includes representa- 
tives from industry and non-governmental 
organizations. “We will try to identify the 
best ways to monitor before, during and 
after mining to determine the total impact 
and recovery of the ecosystems,” says 
Philip Weaver, managing director of Sea- 
scape Consultants in Romsey, UK, which 
is coordinating MIDAS. 

Cruises to conduct experiments and 
sampling at depth form a core part of the 
project’s work. The IFREMER cruise, on 
the research vessel Pourquoi Pas?, was the 
first stage of a two-year experiment to test 
the effects of sulphide plumes. The research 
team weighed mussels found around hydro- 
thermal vents at a depth of 1.7 kilometres 
and assessed their general health. Next year, 
they will return and mimic the effects of 
particle plumes on the mussels, monitoring 
their reactions — for instance, death, migra- 
tion or increased numbers — with tempera- 
ture sensors and cameras. The results of the 
tests will then be studied on shore. 

A second MIDAS study is currently simu- 
lating potential effects on marine life in the 
shallow waters of Portman Bay off the coast 
of southeastern Spain. An onshore min- 
ing facility dumped waste into the waters 
there for three decades, and the research- 
ers want to assess how the waste affected 
the underwater fauna. “We want to see how 
metal-loaded plumes behave — how far they 
spread, how long it takes for them to settle 
and so on,’ says marine geoscientist Miquel 
Canals Artigas of the University of Barcelona 
in Spain, who is leading the expedition. 

MIDAS will submit its report to the European
Commission in November 2016.
Teen drug use gets
supersize study 


US government programme will examine 10,000 adolescents 
to document effects on developing brains. 


BY SARA REARDON 


When the states of Colorado and
Washington voted to legalize
marijuana in 2012, the abrupt
and unprecedented policy switch sent the US
National Institute on Drug Abuse (NIDA) into 
what its director Nora Volkow describes as “red 
alarm”. Although marijuana remained illegal for
people under the age of 21, the drug’s increased 
availability and growing public acceptance 
suggested that teenagers might be more likely 
to try it (see ‘Highs and lows’). Almost nothing
is known about whether or how marijuana 
affects the developing adolescent brain, espe- 
cially when used with alcohol and other drugs. 

The new laws, along with advances in brain- 
imaging technology, convinced Volkow to 
accelerate the launch of an ambitious effort to 
follow 10,000 US adolescents for ten years in 
an attempt to determine whether marijuana, 
alcohol and nicotine use are associated with 
changes in brain function and behaviour. 

At a likely cost of more than US$300 million,
it will be the largest longitudinal brain-imaging 
study of adolescents yet. Researchers are eager 
to study a poorly understood period of human 
development — but some question whether it is 
possible to design a programme that will provide 
useful information about the effects of drugs. 

“It’s definitely an idea that’s overdue,” says
Deanna Barch, a psychologist at Washington 
University in St. Louis, Missouri. “The downside
is it’s a lot of eggs in one basket.”

The exact design of the programme is still in 
flux. In May, NIDA held a planning meeting 
with the National Institute on Alcohol Abuse 
and Alcoholism, the National Cancer Institute 


HIGHS AND LOWS 


Attitudes towards marijuana use among 
US students in grade 12 (aged 17-18) 
have changed dramatically. 
and the National Institute of Child Health
and Human Development (NICHD), which will help
to fund the project. The partners decided to 
recruit participants at the age of ten. Roughly 
every two years, researchers will image the 
children’s brains, perform psychiatric and 
cognitive tests, and examine factors such as 
genetics and environmental exposures. 

To enlist enough participants likely to use 
drugs, the study will recruit largely from high- 
risk groups, such as children of low socio- 
economic status or those whose parents use 
drugs. Volkow says that the group plans to seek 
input from colleagues at November's Society 
for Neuroscience meeting in Washington DC 
before recruiting researchers for the programme. 

Hugh Garavan, a psychologist at the University
of Vermont in Burlington, says that there
is much to recommend such an analysis.
He and his colleagues reported in July
— on the basis of a study of 692 European 
teenagers — that they had identified brain 
structures and activity patterns that could 
predict with around 70% accuracy which 
14-year-olds would become binge drink- 
ers by the age of 16 (R. Whelan et al. Nature 
http://doi.org/t5t; 2014). 

A larger study such as NIDA’s is the 
obvious next step, he says, because it will 
help researchers to account for the myriad 
environmental and genetic factors that 
influence development. “I can anticipate 
this becoming a landmark study,” he adds.

Even if the larger human study does not 
reveal new information about the effects of 
drugs, it will certainly shed light on the nor- 
mally developing brain, says Lisa Freund, a 
developmental psychologist at the NICHD. 
Volkow says that NIDA plans to make all 
trial data available to researchers, which 
could inspire further studies. 

But recruiting and retaining so many 
participants will be challenging, requiring 
researchers to win the trust of children and 
their families and to ensure that participants 
are not burdened by the drug tests and brain 
scans. The scientists will also need to be flex- 
ible in the face of rapidly improving brain- 
imaging techniques, says Terry Jernigan, 
a cognitive scientist at the University of 
California, San Diego. Whatever technol- 
ogy is chosen is likely to be obsolete by the 
study’s end, she cautions; upgrading tech- 
nology haphazardly could make it difficult 
to compare data across years or study sites. 
Researchers could use both new and old 
imaging methods during transition periods, 
but that could quickly drive up costs. 

Others question whether enough is 
known about the developing brain to identify 
the mechanisms underlying certain specific 
conditions. Some studies have linked ado- 
lescent use of marijuana to psychosis and to 
onset of schizophrenia in those at risk, but it 
is unclear whether this is the case — and if 
so, whether the drug is a trigger or the teen- 
agers are self-medicating. Because research- 
ers cannot control the drug’s timing and 
dosage, it will be hard to resolve that question 
in relation to what is happening in the brains 
of participants who develop psychosis. 

“It’s almost impossible to establish direct
causality” in such cases, Volkow says. Yet she 
hopes that the NIDA study’s huge sample 
size will reveal broader clues, such as differ- 
ences in brain structure between drug users 
and non-users. The institute is considering 
running a parallel brain-imaging study on 
non-human primates to help to address this. 

B. J. Casey, a psychologist at Cornell Uni- 
versity in New York City, hopes that NIDA 
will address the concerns raised by scientists. 
“Once you start something like this, it’s hard 
to stop even if the outcomes aren't telling, 
because people are so invested,” she says.
A worker sprays insecticide in Haiti to fight mosquitoes that carry chikungunya and other diseases.


INFECTIOUS DISEASE


US assesses virus 
of the Caribbean 


Researchers warn that a change of mosquito host could 
accelerate spread of chikungunya across the Americas. 


BY ALESZU BAJAK 


In the past few months, passengers at North
American airports have been warned that
travel to the Caribbean might result in
an unwanted souvenir. The first outbreak of
chikungunya virus in the Western Hemisphere
began in the French part of the Caribbean
island of St Martin in December and has spread
rapidly around the region, infecting more than
500,000 people.

Since then, at least 480 travellers have
returned to the United States with the mosquito-borne
disease, raising concerns that an
insect biting one of those people would spark a
US chikungunya outbreak. Yet so far, only four
locally acquired cases have been confirmed in
the country, all in southern Florida. The virus
has gained more of a foothold in Central and
South America: authorities have confirmed
174 cases of locally transmitted disease in
El Salvador, Panama, Costa Rica, Venezuela
and the Guianas (see ‘Tropical transfer’).

For now, the Caribbean strain of chikungunya
does not seem likely to expand into
the rest of the Western Hemisphere, mostly
because it is spread by the tropical mosquito 
Aedes aegypti. However, several major chi- 
kungunya outbreaks have been fuelled by a 
specific mutation of the virus that makes it 
more suited to transmission by a different 
species of mosquito — a scenario analysed 
by Carrie Manore, a mathematical epidemi- 
ologist at Tulane University in New Orleans, 
Louisiana, and her colleagues. They report 
that genetic changes in the virus could pro- 
pel chikungunya deep into North and South 
America (C. A. Manore et al. J. Theor. Biol. 
356, 174-191; 2014). The insect that could 
cause the damage is the Asian tiger mosquito 
(Aedes albopictus), which has been expanding 
worldwide for the past two decades and tak- 
ing diseases such as chikungunya and dengue 
with it (see Nature 489, 187-188; 2012). 

Chikungunya was first detected in the 1950s 
in East Africa. It causes fever, severe joint pain 
and, in rare cases, death. Most people recover 
within a week, but painful arthritic symptoms 
can linger for months. 

The Caribbean is fertile ground for the 
spread of the disease, no matter which
mosquito is spreading it. In temperate regions, 
winter weather kills A. aegypti mosquitoes 
and acts as a natural brake on the spread of 
the diseases they carry. But in the Caribbean, 
A. aegypti can survive year-round, and serves 
as an outstanding host vector for diseases, 
says Sylvain Aldighieri, a physician at the Pan 
American Health Organization in Washing- 
ton DC who has helped to track the current 
outbreak. 

Native to Africa, A. aegypti had spread 
throughout the warmer zones of the West- 
ern Hemisphere by the seventeenth century. 
It is found across the southern United States, 
and has penetrated as far north as Virginia. In 
mainland South America, says Aldighieri, it 
can be found in every country except Chile. 
Still, the Caribbean might be the only place 
in the hemisphere with the right density of 
mosquitoes and travelling people to enable a 
chikungunya outbreak. 

Scientists are worried about the rapid 
expansion of the Asian tiger mosquito, which 
is more aggressive than A. aegypti and more 
efficient at transmitting chikungunya. During 
a 2005 chikungunya outbreak on the island 
of Réunion, east of Madagascar, A. albopic- 
tus suddenly became a more efficient vec- 
tor because a genetic mutation in the virus 
allowed it to reproduce better in the mosqui- 
to's midgut and to be transmitted more easily. 
The same mutation arose independently in the 
virus on the Indian Ocean island of Mayotte 
in 2006, and again in 2007 when it appeared 
in Madagascar. 

If a similar mutated virus strain were introduced
to the Western Hemisphere — or if, as in
the past, the Caribbean strain were to mutate 
— chikungunya would become a much larger 
public-health concern for the Americas. 
TROPICAL TRANSFER


Chikungunya virus has spread around the 
Caribbean since December 2013 (suspected 
plus confirmed cases shown). 
In North America, the Asian tiger mosquito
is found in 32 states, from New York to Texas, 
and has been spotted in California, New Mex- 
ico and Arizona. Data for the Southern Hemi- 
sphere are less prevalent and not so reliable, 
says epidemiologist David Morens of the US 
National Institute of Allergy and Infectious 
Diseases in Bethesda, Maryland. However, 
the species is known to be widespread in Latin 
America. 

Manore and her colleagues used a math- 
ematical model to assess the danger of a 
chikungunya outbreak spread by the Asian tiger 
mosquito. It considers rates of susceptibility, 
infectiousness and immunity in both humans 
and mosquitoes to predict how an outbreak 
might evolve over time. The researchers find 
that the relative risk and severity of an outbreak 
depend on the virus–mosquito combination,
with the highest risk coming from the Asian 
tiger mosquito carrying the Réunion mutant 
strain of chikungunya. 

“I’d be concerned about areas that have both
Aedes albopictus and Aedes aegypti,” Manore
says. In these regions — where the virus 
could most easily jump to the more aggressive 
species — there is an urgent need for more 
mosquito trapping and studies of how the virus 
and the insect interact, she says. 

Even within countries, different subgroups of 
the same mosquito species sometimes transmit 
the same viral strain differently, which makes 
comprehensive sampling a pressing matter. 
And tracking something as pervasive and 
hard to pin down as a mosquito is no simple 
task, says Erin Staples, a medical epidemiolo- 
gist at the US Centers for Disease Control and 
Prevention in Atlanta, Georgia. “Right now, 
chikungunya is definitely one of our concerns,” 
she says. “How well will it be transmitted? We 
don't know.”
SCIENTISTS AND THE SOCIAL NETWORK


Giant academic 
social networks have 
taken off to a degree 
that no one expected 
even a few years ago. 
A Nature survey 
explores why. 


BY RICHARD VAN NOORDEN 


In 2011, Emmanuel Nnaemeka Nnadi
needed help to sequence some drug- 
resistant fungal pathogens. A PhD student 
studying microbiology in Nigeria, he did not 
have the expertise and equipment he needed. 
So he turned to ResearchGate, a free social- 
networking site for academics, and fired off a 
few e-mails. When he got a reply from Italian 
geneticist Orazio Romeo, an international col- 
laboration was born. Over the past three years, 
the two scientists have worked together on 
fungal infections in Africa, with Nnadi, now at 
Plateau State University in Bokkos, shipping his 
samples to Romeo at the University of Messina 
for analysis. “It has been a fruitful relationship,” 
says Nnadi — and they have never even met. 
Ijad Madisch, a Berlin-based former physi- 
cian and virologist, tells this story as just one 
example of the successes of ResearchGate, 
which he founded with two friends six years 
ago. Essentially a scholarly version of Facebook 
or LinkedIn, the site gives members a place to 
create profile pages, share papers, track views 
and downloads, and discuss research. Nnadi 
has uploaded all his papers to the site, for 
instance, and Romeo uses it to keep in touch 
with hundreds of scientists, some of whom 
helped him to assemble his first fungal genome. 
More than 4.5 million researchers have 
signed up for ResearchGate, and another 
10,000 arrive every day, says Madisch. That is 
a pittance compared with Facebook's 1.3 bil- 
lion active users, but astonishing for a network 
that only researchers can join. And Madisch 
has grand goals for the site: he hopes that it will 
become a key venue for scientists wanting to 
engage in collaborative discussion, peer review 
papers, share negative results that might never 
otherwise be published, and even upload raw
data sets. “With ResearchGate we're changing
science in a way that’s not entirely foreseeable,” 
he says, telling investors and the media that his 
aim for the site is to win a Nobel prize. 

The company now employs 120 people, 
and last June it announced that it had secured 
US$35 million from investors including the 
world’s richest individual, Bill Gates — cash 
that came on top of two earlier rounds of 
undisclosed investment. “It was really a head-scratcher
when we saw that,” says Leslie Yuan,
who heads a team working on networking 
and innovation software for scientists at the 
University of California, San Francisco. “We 
thought — who are these guys? How are they 
getting so much money?” 



Yuan is not the only one who has been taken 
aback. A few years ago, the idea that millions of 
scholars would rush to join one giant academic 
social network seemed dead in the water. The 
list of failed efforts to launch a ‘Facebook for 
science’ included Scientist Solutions, SciLinks,
Epernicus, 2collab and Nature Network (run 
by the company that publishes Nature). Some 
observers speculated that this was because scientists
were wary of sharing data, papers and
comments online — or if they did want to share,
they would prefer to do it on their own terms,
rather than through a privately owned site. 

But it seems that those earlier efforts were 
ahead of their time — or maybe they were 
simply doing it wrong. Today, ResearchGate 
is just one of several academic social networks 
going viral. San Francisco-based competitor 
Academia.edu says that it has 11 million users. 
“The goal of the company is to rebuild science 
publishing from the ground up,” declares chief
executive Richard Price, who studied philoso- 
phy at the University of Oxford, UK, before he 
founded Academia.edu in 2008, and has already 
raised $17.7 million from venture capitalists. A 
third site, London-based Mendeley, claims 3.1 
million members. It was originally launched as 
software for managing and storing documents, 
but it encourages private and public social net- 
working. The firm was snapped up in 2013 by 
Amsterdam-based publishing giant Elsevier for 
a reported £45 million (US$76 million). 


WINNING FORMULA 

Despite the excitement and investment, it is 
far from clear how much of the activity on 
these sites involves productive engagement, 
and how much is just passing curiosity — or 
a desire to access papers shared by other users 
that they might otherwise have to pay for. “I've 
met basically no academics in my field with a 
favourable view of ResearchGate,” says Daniel
MacArthur, a geneticist at Massachusetts Gen- 
eral Hospital in Boston. 

In an effort to get past the hype and explore 
what is really happening, Nature e-mailed tens 
of thousands of researchers in May to ask how 
they use social networks and other popular 
profile-hosting or search services, and received 
more than 3,500 responses from 95 different 
countries. 

The results confirm that ResearchGate is
certainly well-known (see ‘Remarkable reach’,
and full results online at go.nature.com/jvx7pl). 
More than 88% of scientists and engineers said 
that they were aware of it — slightly more than 
had heard of Google+ and Twitter — with little 
difference between countries. Just under half 
said that they visit regularly, putting the site 
second only to Google Scholar, and ahead 
of Facebook and LinkedIn. Almost 29% of 
regular visitors had signed up for a profile on 
ResearchGate in the past year. 

This does not surprise Billie Swalla, an evolu- 
tionary biologist and director of the University 
of Washington's Friday Harbor Laboratories. 
Swalla says that she and most of her colleagues 
are on ResearchGate, where she finds the lat- 
est relevant papers much more easily than by 
following marine-biology journals. “They do 
send you a lot of spam,” she says, “but in the past
few months, I’ve found that every important 
paper I thought I should read has come through 
ResearchGate.” Swalla admits to comparing 
herself to others using the site's ‘RG Score’ — 
its metric of social engagement. “I think it taps 
into some basic human instinct,” she adds.


TACTICAL BREAKDOWN 
Some irritated scientists say that the site taps 
into human instincts only too well — by regu- 
larly sending out automated e-mails that pro- 
fess to come from colleagues active on the site, 
thus luring others to join on false pretences. 
(Indeed, 35% of regular ResearchGate users 
in Nature’s survey said that they joined the 
site because they received an e-mail.) Lars 
Arvestad, a computer scientist at Stockholm 
University, is fed up with the tactic. “I think 
it is a disgraceful kind of marketing and I am 
choosing not to use their service because of 
that,” he says. Some of the apparent profiles 
on the site are not owned by real people, but 
are created automatically — and incompletely 
— by scraping details of people’s affiliations, 
publication records and PDFs, if available, 
from around the web. That annoys researchers 
who do not want to be on the site, and who feel 
that the pages misrepresent them — especially 
when they discover that ResearchGate will not 
take down the pages when asked. Madisch is 
unruffled by these complaints. The pages are 
marked for what they are, and are not counted 
among the site’s real users, he says, adding: “We
changed many things based on the feedback we 
got. But the criticism is relatively small, relative 
to the number of people who like the service.” 
Academia.edu seems less well-known than 
ResearchGate: only 29% of scientists in the 
survey were aware of it and just 5% visited regu- 
larly. But it has its fans — among them climate 
scientist Hans von Storch, director of the Institute
for Coastal Research in Geesthacht, Germany,
who uses the site to share not only his papers, but
also his interviews, book reviews and lectures.

NATURE.COM: For an interactive graphic and more on profile-managing, see go.nature.com/fjvxxt


REMARKABLE REACH

More than 3,000 scientists and engineers (below) told Nature about their awareness of various giant social
networks and research-profiling sites. Just under half said that they visit ResearchGate regularly. Another
480 respondents in the humanities, arts and social sciences were less keen on ResearchGate — see charts at
go.nature.com/fjvxxt.
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Price points out that Academia.edu has much 
higher web traffic than ResearchGate overall, 
perhaps because — unlike its rival — it is open 
to anyone to join. And for the 480 social sci- 
ence, arts and humanities researchers included 
in Nature’s survey, usage of the two sites was 
more closely matched. 


“WE HAVE TO BUILD BETTER
FILTER SYSTEMS TO
EXPLAIN WHAT RESEARCH
YOU CAN TRUST.”


High numbers by themselves do not mean 
much, says Jan Reichelt, a co-founder of Mende- 
ley (which scored 48% awareness and 8% regu- 
lar visitors among scientists in Nature's survey). 
“We've moved away from mentioning ‘start-up 
vanity metrics’ as the key number,” he says. “It
doesn't tell you about the quality of interaction.”

To get a rough measure of that quality, Nature
asked a subset of the most active respondents 
what they actually do on the sites they visit 
regularly (see ‘Idle, browse or chat?’). The 
most-selected activity on both ResearchGate 
and Academia.edu was simply maintaining a 
profile in case someone wanted to get in touch 
— suggesting that many researchers regard 
their profiles as a way to boost their professional 
presence online. After that, the most popular 
options involved posting content related to 
work, discovering related peers, tracking met- 
rics and finding recommended research papers. 


“These are tools that people are using to raise 
their profiles and become more discoverable, 
not community tools of social interaction,” 
argues Deni Auclair, a lead analyst for Outsell, a 
media, information and technology consulting 
firm in Burlingame, California. By comparison, 
Twitter, although used regularly by only 13% 
of scientists in Nature’s survey, is much more 
interactive: half of the Twitterati said that they 
use it to follow discussions on research-related 
issues, and 40% said that it is a medium for 
“commenting on research that is relevant to my 
field” (compared with 15% on ResearchGate). 


PAPERS PLEASE 
Laura Warman, an ecologist at the Institute of 
Pacific Islands Forestry in Hilo, Hawaii, echoes 
the views of many when she says that she has 
uploaded papers on Academia.edu to keep track 
of how often, where and when they are down- 
loaded. “I find it especially intriguing that my 
most downloaded paper is not my most cited 
work,” she says. “To put it bluntly, I have no idea
if these sites have any impact whatsoever on my 
career — I tend to doubt they do — but I enjoy 
knowing that my work is being discussed.” 
Price says that 3 million papers have been 
uploaded to Academia.edu, and Madisch says 
that 14 million are accessible through Research- 
Gate (although he will not say how many of 
those have been automatically scraped from 
freely accessible places elsewhere). An unpub- 
lished study conducted by computer scientists 
Madian Khabsa at Pennsylvania State Univer- 
sity in University Park and Mike Thelwall at 
the University of Wolverhampton, UK, sug- 
gests that by August this year, the full texts of 
around one-quarter of all molecular-biology 
papers published in 2012 were available from 
ResearchGate. That said, these days papers are 
easily found on many sites: a study conducted
for the European Commission last year found
that 18% of biology papers published in 2008-11
were open access from the start, and said that
57% could be read for free in some form, somewhere
on the Internet, by April 2013 (see Nature
500, 386-387; 2013).


IDLE, BROWSE OR CHAT?

Nature asked a subset of regular visitors to social networks how they used the sites professionally. (Each person was asked
to tick all activities that applied.) The results suggest that Facebook is not widely used professionally; that researchers on
Twitter are very active and social; and that many users of Academia.edu and ResearchGate signed up in case someone
wants to contact them — but are not chatty themselves. Full results are available at go.nature.com/jvx7pl.

[Figure: circular charts, one per site, of responses to three questions: ‘How do you use this site professionally?’, ‘Approximately how often do you visit this site in a professional capacity?’ and ‘How long have you had a profile on this site?’. Recoverable panel labels include ResearchGate (1,589 regular visitors) and Academia.edu; counts of 283 and 198 regular visitors also appear.]

“Mainly, a source of stress every time an
e-mail pops in showing that my colleagues/
competitors are publishing more than myself.”
AGE 35-44, PROFESSOR, BRAZIL

“It is the only useful community website for
research purposes.”
AGE 55-64, PROFESSOR, HUNGARY

“Has led to invites to referee papers/external
assessments.”
AGE 45-54, POSTDOCTORAL FELLOW,
UNITED KINGDOM

“I have been able to post old papers which
otherwise would be inaccessible to people.”
AGE 55-64, PROFESSOR, UNITED STATES

“Primarily still a reference manager for me.
The social component is less important.”
AGE 35-44, RESEARCH SCIENTIST, CANADA

“Fairly useful as a documents clearinghouse
for lab group.”
AGE 25-34, POSTDOCTORAL FELLOW,
UNITED STATES

Publishers are worried that the sites could 
become public troves of illegally uploaded con- 
tent. In late 2013, Elsevier sent 3,000 notices 
to Academia.edu and other sites under the US 
Digital Millennium Copyright Act (DMCA), 
demanding that they take down papers for 
which the publisher owned copyright. Aca- 
demia.edu passed each notice on to its users 
— a decision that triggered a public outcry. One
researcher who received a take-down request
did not want to be named, but told Nature: “I
hardly know any scientists who don't violate
copyright laws. We just fly below the radar and
hope that the publishers don't notice.”


These concerns are not unique to large social 
networks, says Price; the same issue surrounds 
content posted in universities’ online reposito- 
ries (to which Elsevier also sent some DMCA 
notices last year). “This is really part of the wider 
battle where academics want to share their 
papers freely online, whereas publishers want 
to keep content behind a paywall to monetize it,”
he says, noting the nuance that many publishers
allow researchers to upload the final accepted
version of a manuscript, but not the final PDF.
He has seen fewer take-down notices this year. 


OPEN INTENTIONS 

Giant social networks could also disrupt the 
research landscape by capturing other pub- 
lic content. In March this year, Research- 
Gate launched a feature called Open Review, 
encouraging users to post in-depth critiques 
of existing publications. Madisch says that
members have now contributed more than
10,000 such reviews. “I believe that this is just 
the tip of the iceberg,” he says. He wants users 
to upload raw data sets too — including, per- 
haps, negative results that might otherwise 
never be published — and says that 700 are 
appearing on the site each day. 

At Academia.edu, Price is planning to 
launch a post-publication peer-review feature 
as well. “We have to build better filter systems 
to explain what research you can trust,” he says.

Few would argue with these goals, but many 
wonder why researchers would deposit their 
data sets and reviews on these new social net- 
works, rather than elsewhere online — on their 
own websites, for example, in university repos- 
itories, or on dedicated data-storage sites such 
as Dryad or figshare (see Nature 500, 243-245; 
2013 — figshare is funded by Nature’s parent 
company, Macmillan Publishers). To Madisch,
the answer lies with the social sites’ burgeoning
communities of users — the famed ‘network
effect’. “If you post on ResearchGate, you
are reaching the people who matter,” he says.
But Titus Brown, a computational scientist at
Michigan State University in East Lansing,
is concerned about the sites’ business plans
as they seek to survive. “What worries me is
that at some point ResearchGate will use their
information to make a profit in ways that we
are uncomfortable with — or they will be
bought by someone who will do that,” he says.

[Figure key for ‘Idle, browse or chat?’: each wedge in the circular charts corresponds to a question on the right. The answers are grouped by the intensity of user engagement they imply: low (green), medium (yellow) and high (blue). A further panel is labelled ‘330 regular visitors’.]

“Extremely useful in conference settings.”
AGE 35-44, RESEARCH SCIENTIST,
UNITED STATES

“Great way to keep up-to-date on what is
happening NOW in the research community.”
AGE 45-54, HEAD OF ACADEMIC
DEPARTMENT, UNITED STATES

Madisch says that ResearchGate will not sell 
its user data, and that it already makes some 
money by running job adverts (as does Aca- 
demia.edu). In the future, he hopes to add a 
marketplace for laboratory services and prod- 
ucts, connecting companies and corporate 
researchers to academics (28% of the network’s 
users are from the corporate world, he says).
Price talks about providing institutional analytics
to universities as well. But analysts including
Auclair argue that the sites have limited earning
potential, because they are targeted at a much
narrower demographic than Facebook or Twitter.
“What's most likely is the networks that have
critical mass get acquired and those that don't
will die,” she says (although Madisch says that
being bought out “would be a personal failure”).

“Mainly useful for job hunting.”
AGE 25-34, PHD STUDENT, UNITED STATES

“It is too much like Facebook — fluffy forwards
and such that are not scientific or related to
professionalism.”
AGE 45-54, ASSOCIATE PROFESSOR,
UNITED STATES

Mendeley’s acquisition by Elsevier last 
year left the site better placed to become a 
global platform for research collaboration, 
says Reichelt, because it intersects with 
other Elsevier products such as the Scopus 
database of research articles. Much of the 
collaboration done using Mendeley is pri- 
vate, but the firm does allow other computer 
programs to automatically pull out useful 
anonymized public information — such 
as which papers are viewed most by which
researchers. Neither Academia.edu nor
ResearchGate yet offer this service, although
Madisch says that he is developing it.

“Facebook has zero credibility in my
professional life.”
AGE 35-44, STAFF SCIENTIST,

“The (invitation-only) groups for professional
astronomers and pulsar astronomers have
become vibrant discussion fora.”
AGE 35-44, RESEARCH SCIENTIST,

“I think at some point there will be one
winner in this race,” says Madisch. Or — as 
Nature’s survey suggests is already happening 
— different disciplines might favour different 
sites. Some analysts argue that despite their 
millions of users, massive social academic net- 
working sites have not yet proven their essen- 
tial worth. “They are nice-to-have tools, not 
need-to-have,” says Auclair. But Price says that 
the networks are on the front line of a trend 
that cannot be ignored. “We saw the changes 
in the market, and we could see that academics 
wanted to share openly. The tide is starting to 
turn in our direction.”


Richard Van Noorden is a senior reporter for
Nature in London.


Don't blame the mothers

Careless discussion of epigenetic research on how early life affects health across generations could harm women, warn Sarah S. Richardson and colleagues.

From folk medicine to popular culture, there is an abiding fascination with how
the experiences of pregnant women
imprint on their descendants. The latest
wave in this discussion flows from studies of
epigenetics — analyses of heritable changes
to DNA that affect gene activity but not
nucleotide sequence. Such DNA modification
has been implicated in a child’s future
risk of obesity, diseases such as diabetes, and
poor response to stress.

Headlines in the press reveal how these
findings are often simplified to focus on
the maternal impact: ‘Mother’s diet during
pregnancy alters baby’s DNA’ (BBC),
‘Grandma's Experiences Leave a Mark on
Your Genes’ (Discover), and ‘Pregnant 9/11
survivors transmitted trauma to their children’
(The Guardian). Factors such as the
paternal contribution, family life and social
environment receive less attention.

Questions about the long shadow of the
uterine environment are part of a burgeoning
field known as developmental origins of
health and disease (DOHaD). For example,
one study revealed that 45% of children
born to women with type 2 diabetes develop
diabetes by their mid-twenties, compared
with 9% of children whose mothers developed
diabetes after pregnancy.

DOHaD would ideally guide policies
that support parents and children, but
exaggerations and over-simplifications are
making scapegoats of mothers, and could
even increase surveillance and regulation of
pregnant women. As academics working in
DOHaD and cultural studies of science, we
are concerned. We urge researchers, press
officers and journalists to consider the ramifications
of irresponsible discussion.


ALARMING PRECEDENTS
There is a long history of society blaming
mothers for the ill health of their children.
Preliminary evidence of fetal harm has led
to regulatory over-reach. First recognized in
the 1970s, fetal alcohol syndrome (FAS) is a
collection of physical and mental problems in
children of women who drink heavily during
pregnancy. In 1981, the US Surgeon General
advised that no level of alcohol consumption
was safe for pregnant women. Drinking
during pregnancy was stigmatized and
even criminalized. Bars and restaurants were
required to display warnings that drinking
causes birth defects. Many moderate
drinkers stopped consuming alcohol during 
pregnancy, but rates of FAS did not fall.
Although those who drink heavily dur- 
ing pregnancy can endanger their children, 
the risks of moderate drinking were over- 
stated by policy-makers — a point recently 
reaffirmed by the Danish National Birth 
Cohort study, which did not find adverse 
effects in children whose mothers drank 
moderately during pregnancy. Nonetheless,
warnings about alcohol during pregnancy 
made in inappropriate contexts still cause 
pregnant women to suffer social condem- 
nation and to agonize over an occasional sip. 
In the 1980s and 1990s, surging use of crack 
cocaine (a smokable form of the drug) in the 
United States led to media hysteria around 
‘crack babies’ — those who had been exposed 
to cocaine in the womb. Pregnant women 
who took drugs lost social benefits, had their 
children taken away and were even sent to 
prison. More than 400 pregnant women, 
mostly African American, have been pros- 
ecuted for endangering their fetuses in this 
way. Exposed infants were stigmatized as a 
biologically doomed underclass. Today, fetal 
exposure to crack or cocaine is considered 
no more harmful than exposure to tobacco 
or alcohol, but criminal prosecution of pregnant
women who take such drugs continues.
Previous generations found other ways to 
blame women. As late as the 1970s, ‘refrigera- 
tor mothers’ (a disparaging term for a parent 
lacking emotional warmth) were faulted for 
their children’s autism. Until the nineteenth 
century, medical texts attributed birth 
deformities, mental defects and criminal ten- 
dencies to the mother’s diet and nerves, and 
to the company she kept during pregnancy. 
Although it does not yet go to the same 
extremes, public reaction to DOHaD 
research today resembles that of the past in 
disturbing ways. A mother’s individual influ- 
ence over a vulnerable fetus is emphasized; 
the role of societal factors is not. And studies 
now extend beyond substance use, to include 
all aspects of daily life. 


CONTEXT IS KEY 
A 2013 story on the health-information 
website WebMD demonstrates the sort of 
responsible reporting that we would like 
to see more of (see go.nature.com/p2krhs). 
The story reported findings of a four-fold 
increased risk of bipolar disorder in adult 
offspring if a mother had influenza during
pregnancy, but it emphasized that the overall
risk observed was small and that bipolar
disorder is treatable. It stated that the study 
considered only one of many possible risk 
factors and did not establish cause and effect. 
Furthermore, the headline did not lead with 
the scary number. 

Much less context was given in coverage
of a 2012 paper showing that
second-generation offspring of rats eating
a high-fat diet during pregnancy had an 80%
chance of cancer, compared with 50% of
control rats. “Why you should worry about
grandma's eating habits,” read one headline.
“Think twice about that bag of potato chips
because you are eating for more than two,”
warned another story. These articles did not
state that the rats were bred for high cancer
rates. Nor did they include inconsistent
results: third-generation offspring of
female rats on high-fat diets actually had
lower incidences of tumours than their
control peers.

Inadequately sup- 
ported and poorly contextualized statements 
are also found in well-intentioned educational 
materials. The website beginbeforebirth.org, 
put together by researchers at Imperial Col- 
lege London, advocates ways to “support and 
look after pregnant women”. A video on the
website portrays a 19-year-old released from
prison after a stint for looting (see
go.nature.com/wynfzw). “Perhaps his problems stretch
right back to the womb,” the narrator says.
“Could better care of pregnant women be a
new way of preventing crime?” At best, such 
suggestions overstate conclusions of current 
research. 




BEYOND THE MATERNAL IMPRINT 

Today, an increasing segment of DOHaD 
research recognizes that fathers and grand- 
parents also affect descendants’ health. Stud- 
ies suggest that diet and stress modify sperm 
epigenetically and increase an offspring’s 
risk of heart disease, autism and schizophre- 
nia. In humans, the influence of fathers over 
mothers’ psychological and physical state 
is increasingly recognized. So are effects of 
racial discrimination, lack of access to nutri- 
tious foods and exposure to toxic chemicals 
in the environment. 

Viewed from this broader perspective, 
DOHaD provides a rationale for policies to 
improve the quality of life for women and 
men. It must not be used to lecture individ- 
ual women, as in a 2014 news report from 
the US media organization National Pub- 
lic Radio on an epigenetics study in mice: 
“Pregnancy should be a time to double- 
down on healthful eating if you want to 
avoid setting up your unborn child for a 
lifetime of wrestling with obesity.” How are 
women who lack time or access to healthy 
foods to act on such advice? 

We urge scientists, educators and report- 
ers to anticipate how DOHaD work is likely 
to be interpreted in popular discussions. 
Although no one denies that healthy behaviour
is important during pregnancy, all those
involved should be at pains to explain that
findings are too preliminary to provide
recommendations for daily living.

132 | NATURE | VOL 512 | 14 AUGUST 2014
© 2014 Macmillan Publishers Limited. All rights reserved

Caveats span four areas. First, avoid 
extrapolating from animal studies to 
humans without qualification. The short 
lifespans and large litter sizes favoured for 
lab studies often make animal models poor 
proxies for human reproduction. Second, 
emphasize the role of both paternal and 
maternal effects. This can counterbalance 
the tendency to pin poor outcomes on 
maternal behaviour. Third, convey com- 
plexity. Intrauterine exposures can raise or 
lower disease risk, but so too can a plethora 
of other intertwined genetic, lifestyle, socio- 
economic and environmental factors that 
are poorly understood. Fourth, recognize 
the role of society. Many of the intrauterine 
stressors that DOHaD identifies as having 
adverse intergenerational effects corre- 
late with social gradients of class, race and 
gender. This points to the need for societal 
changes rather than individual solutions. 

Although remembering past excesses of 
‘mother-blame’ might dampen excitement 
about epigenetic research in DOHaD, it 
will help the field to improve health without 
constraining women’s freedom.


Wonder maker


Andrew Robinson delves into a study inspired by James Watt’s fascinating workshop. 


James Watt: Making the World Anew
BEN RUSSELL
Reaktion: 2014.

In 1924, London’s Science Museum
acquired the entire workshop of engineer
James Watt, left almost untouched in the
attic of his house in Birmingham, UK, since
his death more than a century before. The
museum put a recreation of the workshop
on permanent display in 2011. Among the
8,434 items left by the Scotsman, best known
for his innovative steam engine, is an enormous
range of tools, including the earliest
known circular saws. There are also mathematical
instruments, optical experiments,
minerals and chemicals, pottery and ceramics
made by Watt, busts of famous figures
waiting to be copied in plaster of Paris, and
engine-related objects — such as a box containing
the fragments of his attempts to make
an engine that used pure rotary motion.
This workshop inspired Ben Russell, the
Science Museum's curator of mechanical
engineering, to write his engaging James
Watt: Making the World Anew. He explains
that the volume of material, “crossing the
boundaries between philosophy and craft,
makes it hard to categorize the contents
against any one of the labels which have
been applied to Watt over time: philosopher
or craftsman primarily,
but engineer and chemist, as well.”
The diversity of Watt’s interests and activi- 
ties was astonishing, even when compared 
with the achievements of his Enlightenment 
contemporaries. Chemist, inventor and 
Royal Society president Humphry Davy, for 
instance, called him a “modern Archimedes” 
whose inventions had made industrialized 
Britain remarkably powerful for such a small 
nation. 

Watt’s first steam engine, which began 
operating in 1776, was successful because 
it had three times the coal-combustion 
efficiency of the existing engine designed 
by Thomas Newcomen, introduced in 
1712. The steam cylinder in Newcomen’s 
‘atmospheric engine’ had to be sprayed
with cold water at each cycle to condense
the steam, creating a partial vacuum that
allowed atmospheric pressure to push the 
piston down. In 1765, in Glasgow, Watt had 
a “major leap of imagination”, as Russell puts
it: the idea of building a separate condenser, 
so that cylinder and piston did not lose heat. 
By patenting the principles of the condenser 
and not the means of applying them, Watt 
and his business partner Matthew Boulton 
became wealthy, although not without a long
legal battle against their rivals in the 1790s. 
Their engine — its power defined in horse- 
power, a unit invented by Watt and today 
most commonly converted as 746 watts — 
became an industry standard by 1800, for 
pumping water from mines and driving 
machinery in mills and factories. 

From 1804, Watt moved from steam to 
sculpture, creating plaster of Paris cop- 
ies of busts, then much in demand among 
the wealthy. His ‘sculpture machine’ was a 
three-dimensional pantograph, powered 
by a treadle and worked by means of linked 
and geared arms, one ending in a probe
and one in a high-speed, rotating cutting
tool. As the probe traced the surface of the
original bust, the tool duplicated its motion
and cut a plaster block. Today, about 400 of
Watt's sculptures are in storage at the Science
Museum, including casts, busts, depictions
of contemporaries including the chemist
Joseph Black, and copies of Boulton’s 1809
death mask. After his own death in 1819,
Watt became the first engineer to be commemorated
in Westminster Abbey. For the
Victorians, Russell shows, Watt was “a new
kind of industrial hero” whose stature was
comparable to Isaac Newton’s as a physicist.

As Russell admits, there is no shortage of
recently published studies of Watt, such as
Richard Hills’s three-volume biography
James Watt (Landmark, 2002-06) and
James Watt, Chemist by David Miller
(Pickering & Chatto, 2009). But where Russell
focuses on Watt as a man able “not just
to think but to do: to use tools, techniques
and materials, to create tangible things
across a range of activities”, most studies
tend to emphasize his capacity as a thinker.
Perhaps that tendency is inevitable. Scientists
and science historians generally revere
original theories with unforeseeable consequences
more than practical inventions
with immediate applications — Newton
and Albert Einstein more than Christopher
Wren, Watt and Thomas Edison. For all the
wonderful creativity on display in his workshop,
Watt was essentially earthbound. Yet
his life and work are decidedly relevant to
the debate about how scientific discoveries
are best turned into marketable inventions.
Watt’s way of working — with a business
partner and a patentable purpose, whether
an efficient coal-driven means of pumping
flood water out of mineshafts or the mass
production of pottery — could hold lessons
for any university or government keen to
promote technology transfer.

Watt was born in Scotland, trained as an
instrument maker in England, made his
breakthrough with the steam engine in Scotland,
and began manufacturing it in England,
where he settled. Next month, there
will be a referendum on Scottish independence
from the United Kingdom. Whatever
the outcome, Watt’s remarkable life is a definite
benefit arising from the close economic,
intellectual and cultural union of Scotland
and England.

Andrew Robinson is the author of The Last
Man Who Knew Everything — a biography
of the polymath Thomas Young — and
editor of The Scientists.
e-mail: andrew.robinson33@virgin.net


Books in brief

The Social Roots of Risk: Producing Disasters, Promoting Resilience
Kathleen Tierney STANFORD UNIVERSITY PRESS (2014)
The origins of disaster lie in “the ordinary everyday workings of
society”, avers sociologist Kathleen Tierney in this brilliant treatise.
Drawing on a trove of timely case studies, Tierney analyses how
factors such as speculative finance and rampant development allow
natural and economic blips to tip more easily into catastrophe.
Resilience, she argues, is rooted in sustainable ecological and social
development. It is transformative risk reduction, not bailouts, that
will help humanity to weather coming upheavals.

Happiness by Design: Change What You Do, Not How You Think
Paul Dolan HUDSON STREET (2014)
The science of happiness has been with us since at least the 1940s,
when Abraham Maslow’s ideas opened up a psychology based
on feeding the potential for positivity rather than simply treating
symptoms. To this now-crowded table, behavioural scientist Paul
Dolan brings a feast of US and European research, and some
significant insights. Dolan argues that happiness depends on where
we focus our attention, and on how well we balance purpose and
pleasure. His action-oriented outline for achieving that equilibrium
draws in part on work with eminent psychologist Daniel Kahneman.

Great Minds: Reflections of 111 Top Scientists
Balazs Hargittai, Magdolna Hargittai and Istvan Hargittai
OXFORD UNIVERSITY PRESS (2014)
Over two decades, chemists Balazs, Magdolna and Istvan Hargittai
interviewed hundreds of prominent scientists, including 68 Nobel
laureates. This distillation features excerpts from 111 of these frank
conversations. Featured are mathematician John Conway on how his
discovery of surreal numbers was like finding a palace after drifting
around a strange city; physicist Gerard ‘t Hooft on the improbability
of intelligent extraterrestrials; and physicist Mildred Dresselhaus,
biologist Francis Crick, and more on the fascination of the life scientific.

Working Stiff: Two Years, 262 Bodies, and the Making of a Medical Examiner
Judy Melinek and T. J. Mitchell SCRIBNER (2014)
“A hard hat was still there, lying on its side in a pool of blood
and brains, coffee and doughnuts.” Judy Melinek’s inside story
on forensic-pathology training, written with her husband, writer
T. J. Mitchell, is inevitably big on gore. But Melinek, a “sunny optimist”,
offers more than cheap thrills. The flamboyant disclosures — how
to handle rotting flesh or use pruning shears to snap ribs — are
balanced by her soul-baring account of identifying human remains in
the wake of the terrorist attacks in New York on 11 September 2001.

The Wastewater Gardener: Preserving the Planet One Flush at a Time
Mark Nelson SYNERGETIC (2014)
It takes 1,000 tonnes of water to move 1 tonne of human faeces,
notes engineer Mark Nelson. His alternative to costly, unsustainable
sanitation is constructed wetland — subsurface-flow gravel beds
in which plant roots and microbial action purify wastewater for a
full range of uses. Nelson, a veteran of the 1990s US survivability
experiment Biosphere 2, has built “wastewater gardens” from
Algeria to Australia, Mexico and beyond. Barbara Kiser




Correspondence 


Russian stamp to 
honour physicist 


Russia has just issued a postage 
stamp to mark the centenary 

of the birth of the brilliant 
physicist and cosmologist Yakov 
Zeldovich (1914-87). 

Among his many 
achievements, and despite never 
having received a university 
degree, Zel'dovich developed 
the theories of nuclear chain 
reactions and of the gravitational 
lens (see R. A. Sunyaev (ed.) 
Zeldovich: Reminiscences 
Chapman & Hall/CRC; 2004). 

As a theoretician, he was
involved in the creation of 
Soviet nuclear weapons — the 
atomic bomb in 1949, with Lev 
Landau, and the hydrogen bomb 
in 1953, with Andrei Sakharov. 
Like Robert Oppenheimer in 
the United States, he met with 
government opposition when he 
declined to continue working on 
weapons development. 

Moving over to astrophysics, 
Zeldovich made seminal 
contributions in gravitational 
instabilities and in cosmological 
fluctuations, with the Sunyaev–
Zel'dovich effect being among
the best known. In 2001, an 
asteroid, 11438 Zeldovich, was 
named in his honour. 

Renad I. Zhdanov Kazan
Federal University; and Sholokhov 
Moscow State University for the 
Humanities, Russia. 

Pascal Chardonnet University of 
Savoie, Annecy, France. 
zrenad@gmail.com 


White possums must 
stay cool to survive 


It is ironic that Australia, one 
of the world’s highest carbon 
emitters per capita, is giving up 
on a hard-won plan to reduce 
its greenhouse-gas emissions 
(see Nature 511, 392; 2014) 
just as climate change could be 
about to claim one of its rarest 
and most iconic animals — the 
white lemuroid ringtail possum 
(Hemibelideus lemuroides). 
The white possum was once
abundant in cool rainforests
on Mount Lewis in northern
Queensland, but its population 
collapsed abruptly following a 
severe heatwave in 2005. Today, 
just a handful of individuals are 
left (see J. Chandler New Scientist 
Issue 2980, 42-45; 2014). 

Tropical mountains are full 
of endemic species that have 
adapted to cooler local climates 
and are particularly vulnerable 
to heatwaves and other extreme 
weather associated with climate 
change (see go.nature.com/ 
vqskfa). 

It has therefore been suggested 
that the white possum might 
be a more sensitive indicator of
climate change than the polar 
bear (W. Laurance New Scientist 
Issue 2690, 14; 2009). But first 
the temperature of its habitat 
must be stabilized so that its 
numbers can be restored. 
William F. Laurance, Susan
Laurance James Cook University,
Cairns, Australia.
bill.laurance@jcu.edu.au
Christine Milne Parliament 
House, Canberra, Australia. 


Mexican GM maize 
rift is not so simple 


You rightly point out that the issue 
of genetically modified (GM) 
maize (corn) is more sensitive and 
complex in Mexico than in other 
countries (Nature 511, 16-17; 
2014), but you owe readers a more
in-depth and balanced view.

The rift in Mexico’s scientific 
community over GM maize is 
not directly related to the legal 
challenge you discuss. It is a result 
of the commercial push to plant 
GM maize before the benefits and 
risks, and the costs to Mexican 
society, have been fully assessed. 

The possibility of producing 
maize that is tolerant to drought 
and frost, a claim you report from 
government-funded researchers, 
could indeed help to restore 
Mexico’s capacity for growing its
own maize. However, commercial 
cultivars in Mexico (25% of 
total area) have limited reach, 
even after more than 60 years 
of breeding (see, for example,
S. Brush and H. Perales Agr.
Ecosyst. Environ. 121, 211-221;
2007). More than two million
households rely on traditional
landraces for food security
(H. Eakin et al. Dev. Change
45, 133-155; 2014), and the
global prevalence of insecticide- 
producing and herbicide-tolerant 
GM products is at more than 
98% after almost 20 years (see 
go.nature.com/jyux8p). These 
factors mean that such claims 
need to be realized and qualified 
if they are to be taken seriously. 

Those seeking commercial 
acceptance of GM maize still 
need to convince key groups 
in Mexican society, including 
scientists, that the benefits of 
planting it will outweigh the 
risks and social costs. There is
more to maize in Mexico than
productivity and business, and 
it is not only scientists and seed 
companies who have rights. 
Hugo Perales El Colegio de la
Frontera Sur (ECOSUR), San 
Cristobal, Chiapas, Mexico. 
hperales@ecosur.mx 


Create ethics codes 
to curb sex abuse 


A survey published last month 
found evidence of alarming 
levels of sexual violence (towards 
26% of women and 6% of men) 
in the course of fieldwork by 
life scientists (see Nature
http://doi.org/t3n; 2014). Meanwhile,
more than 50 US higher- 
education institutions are under 
investigation for their handling 
of complaints of such incidents. 
As a rape survivor and scientist, I
suggest measures that could help 
to counteract this situation. 
Scientific research 
organizations should draw up 
professional codes of ethics, akin 
to those of the Modern Language 
Association of America and the 
American Historical Association, 
with explicit provisions that 
denounce sexual harassment and 
discrimination on the basis of 
race, gender or sexual orientation. 
A national framework that 
academic, industrial and 
government institutions could 
sign or adapt would be an 
important step. Such proactive 
strategies would prevent 
interference with the core work 
of researchers. In the United 
Kingdom, for example, the 
Athena SWAN Charter outlines a 
series of best practices to further 
and protect women’s careers (see 
go.nature.com/mkxlrg). 
Institutions must make clear 
the repercussions for students 
and employees who transgress, 
and provide a mechanism for 
consistent enforcement (such 
as an adjudication committee) 
that would complement any legal 
redress. 
Margaret C. Hardy University of 
Queensland, Brisbane, Australia. 
m.hardy@imb.uq.edu.au


What females really want


The identification of neural subcircuits used by female fruit flies to make a choice about whether to copulate with a 
potential mate provides a template for understanding how the brain integrates complex information to reach decisions. 


LESLIE C. GRIFFITH 


an intensely personal one, and we like
to believe it is a choice made under free
will. However, more than a century of study- 
ing other species has made it clear that specific 
internal states, together with the presence of 
particular external cues, can alter the prob- 
ability of copulation in a consistent way across 
a population, strongly suggesting that there 
are neural circuits that evaluate relevant, pre- 
determined variables and so bias behaviour. 
Decision-making circuits are present in all spe- 
cies with a nervous system, and understanding 
how the brain carries out this type of compu- 
tation is a major goal of neuroscience. Now, 
three studies (one published in the journal 
Current Biology¹ and two in Neuron²,³) using
different genetic strategies have identified cir- 
cuit components that control the receptivity 
of female fruit flies to male courtship, outlin- 
ing the scope of this complex decision-making 
process. 

Sexual behaviour in the fruit fly Drosophila 
melanogaster is a particularly useful model sys- 
tem for studying decision-making because it 
involves both stereotyped and plastic features. 
The complicated sex-related behaviour of flies is 
conducted by a relatively small brain, and inves- 
tigators have an arsenal of sophisticated genetic 
tools with which to reproducibly identify and 
manipulate particular neurons in freely behaving
animals⁴. Although the courtship behaviour
of males has been studied intensely, progress 
in understanding female reproductive behav- 
iour®* has made it apparent that females are not 
simply passive recipients of male advances. 
Instead, female flies engage in an active and 
complex decision-making process”® that deter- 
mines whether copulation occurs. The female's 
decision-making apparatus uses sensory infor- 
mation — including courtship songs produced 
by male wing vibration, visual cues and olfac- 
tory cues such as pheromones — to assess male 
fitness in the context of the internal state of the 
female herself (Fig. 1). 

Figure 1 | To mate or not to mate. When making a decision about whether to copulate with a potential
mate, a female fruit fly processes external sensory cues provided by the candidate male and the
environment during the courtship ritual, in addition to information about her own internal state. These
various inputs are integrated into decision-making circuits, and the decision is relayed by regulatory
output circuits. Three studies have identified neurons involved in this process: Zhou et al.² identified
the input neurons for olfactory cues and courtship songs (purple arrows); Feng et al.³ identified the
input neurons for mating status (blue arrow); and Bussell et al.¹ identified the output circuit leading to
copulation (red arrow). The neurons related to various other inputs, such as visual cues, remain to be
identified (grey arrows).

Bussell et al.¹ undertook a genome-wide
screen to look for genes that alter the recep-
tivity of female flies to potential mates. They
found that decreasing the activity of the tran-
scription factor Abdominal-B (Abd-B) in
female neurons decreased the rate of mating.
The authors showed that neurons expressing 
Abd-B during development regulate the rate 
of female pausing during courtship, an indica- 
tor of receptivity. Abd-B-expressing neurons 
were activated in response to male-specific 
sensory inputs such as courtship song, but 
only in the presence of male flies (playback of 
a recording of the song alone was ineffective, 
indicating that the male probably provides 
additional visual or chemosensory cues). The 
fact that these neurons are downstream of 
sensory inputs suggests that Abd-B neurons 
are part of the receptivity-output arm of the 
fly decision-making circuitry, and are driven 
by higher centres that process and integrate 
courtship-relevant information. 

In their hunt for circuits involved in this 
sensory integration, Zhou et al.² began with
the assumption that neurons that express 
doublesex (dsx), a gene that is differentially 
expressed in male and female reproductive 
circuits, would contribute to female-specific 
behaviour®. The authors used state-of-the-art
genetic techniques’ to identify and manipulate
small populations of neurons expressing Dsx 
protein in the female adult brain, and found 
that activation of two neuron groups, pC1 and 
pCd, enhanced the rate of copulation. 

Zhou and colleagues provide anatomical 
evidence to suggest that pC1 and pCd convey 
information between brain areas known to 
be involved in processing courtship-related 
sensory information”®. Using calcium levels 
as an indicator of neuronal activation, the 
authors showed that pCd was responsive to 
cis-vaccenyl acetate, a male-specific lipid
pheromone that enhances female receptivity.
Male song activated pC1, and this response
was enhanced by the presence of cis-vaccenyl 
acetate, suggesting that pC1 neurons convey 
integrated information. Thus, pC1 and pCd 
are part of the central circuitry that processes 
courtship-related information. 

In addition to extrinsic cues, receptivity 
is dependent on the female’s internal state. 
Females that have recently mated do not copu-
late, even if presented with a fit and eager male.
This change in behaviour is due to transfer of
the protein sex peptide (SP) to the female in 
the male's seminal fluid. SP-responsive recep- 
tors have been identified’ on nerve cells in the 
abdominal ganglion (a neuronal structure 
roughly analogous to the spinal cord) that 
innervate the reproductive tract, but it was 
unknown how the SP signal is transmitted to 
the central nervous system. 

Feng et al.³ looked for groups of neurons
that decreased female receptivity when electri- 
cally silenced. Their screen identified a group 
of neurons that project from the abdominal 
ganglion into the brain, which they named 
SP abdominal ganglion (SAG) neurons. The 
authors observed that these neurons are 
not themselves responsive to SP, but rather 
receive information from SP-sensitive neurons 
through synapses (junctions that transfer sig- 
nals between cells). Crucially, the strength of 
this connection was modulated by SP and cor- 
related with the female’s mating status. These 
data establish SAG neurons as a conduit of 
information on mating status from the repro- 
ductive tract to the central nervous system. 

Except to those of us who are dedicated fly 
voyeurs, the importance of these individual 
studies might be debatable. In aggregate, 
however, they provide a framework that could 
lead to a detailed cellular and molecular under-
standing of a multifactorial decision-making
process. By highlighting three different and 
as-yet-unconnected regions of the female fly’s 
sexual-behaviour circuitry, these studies pro- 
vide starting points for completing the wiring 
diagram. 

Flies are faced with many of the same basic 
challenges as humans: what to eat, when to 
sleep and whom to mate with. Choice — our 
exercise of free will — is the probabilistic rep- 
resentation of integration processes that are 
rooted in the molecular and neural architec- 
ture of our brains. The genetic and electro- 
physiological tools available in the fly make 
this model organism arguably the best place 
to get our first glimpse into how a brain can 
make complex decisions.


Sandcastles in space


Analysis of a kilometre-sized, near-Earth asteroid shows that forces weaker 
than the weight of a penny can keep it from falling apart. This has implications 
for understanding the evolution of the Solar System. SEE LETTER P.174 


DANIEL J. SCHEERES 


Our logical concepts for how asteroids
should behave have taken another
knock, as evidenced in a paper
by Rozitis et al.¹ on page 174 of this issue.
The researchers establish that a kilometre- 
sized, near-Earth asteroid known as 
(29075) 1950 DA is covered with sandy rego- 
lith (the surface covering of an asteroid) and 
spins so fast — one revolution every 2.12 hours 
— that gravity alone cannot hold this material 
to its surface. This places the asteroid in a sur- 
real state in which an astronaut could easily 
scoop up a sample from its surface, yet would 
have to hold on to the asteroid to avoid being 
flung off. 
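
The balance between gravity and spin described here can be checked with a back-of-envelope calculation. The sketch below is not taken from Rozitis et al.; it assumes a uniform, spherical, strengthless body and asks what bulk density gravity would need to just hold loose material at the equator for a given spin period. The function name `critical_density` is illustrative only.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def critical_density(period_hours: float) -> float:
    """Bulk density (kg/m^3) at which gravity equals the centrifugal
    acceleration on the equator of a uniform sphere with this spin period.

    Setting omega^2 * R = G * M / R^2 with M = (4/3) * pi * rho * R^3
    gives rho = 3 * pi / (G * P^2); the radius cancels out.
    """
    period_s = period_hours * 3600.0
    return 3.0 * math.pi / (G * period_s ** 2)


# Spin period quoted in the text: one revolution every 2.12 hours.
print(f"critical density: {critical_density(2.12):.0f} kg/m^3")
```

For the 2.12-hour period this comes out at roughly 2,400 kg per cubic metre. Under these simplifying assumptions, a body less dense than that would need some cohesion to keep loose material on its equator.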
Rozitis and colleagues show that for this 
rubble-pile asteroid (the body has a porosity of 
roughly 51%) to stay in one piece, it must have
cohesive strength — just not very much. On the 
basis of the density, size and shape of 1950 DA, 
the authors find that the asteroid requires 
a cohesive strength of at least 64 pascals 
to hold all of its rubble-pile components 
together: similar to the pressure that a penny 
exerts on the palm of your hand. 
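
The penny comparison can be checked roughly. The sketch below assumes a US one-cent coin (mass 2.5 g, diameter 19.05 mm) lying flat under Earth's gravity; these coin figures are assumptions of this example and do not come from the paper, and the name `coin_pressure_pa` is hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m s^-2


def coin_pressure_pa(mass_kg: float, diameter_m: float) -> float:
    """Pressure (in pascals) exerted by a flat coin resting on a surface."""
    area = math.pi * (diameter_m / 2.0) ** 2  # contact area of the coin face
    return mass_kg * STANDARD_GRAVITY / area


# Assumed coin: US one-cent piece, 2.5 g and 19.05 mm across.
print(f"coin pressure: {coin_pressure_pa(0.0025, 0.01905):.0f} Pa")
```

With these assumed figures the pressure is a few tens of pascals, the same order of magnitude as the 64 pascals quoted for the asteroid.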

Figure 1 | Cohesive forces in regolith. Computer simulations’ of two metre-sized boulders (pink
spheres) with loosely packed centimetre-sized regolith (green and blue particles) between them. The
whole system is under self-gravitational attraction, and to determine its strength, the boulders are pulled
apart with an increasing force. Panels a and d show the initial configuration, with the pulling force exactly
equal to the gravitational attraction. Panels b, e, and c, f show the system response for equal forces beyond
the gravitational limit. If the regolith has no cohesive strength (panels a–c), it immediately separates from
the boulders once they are pulled with a force greater than their gravitational attraction, which leaves the
regolith behind to aggregate under its own self-gravity and provides no extra strength to the system. If
there are cohesive van der Waals forces between the regolith particles (panels d–f), the particles serve as
a glue and strengthen the bond between the boulders. The level of cohesion required to hold rubble-pile
asteroid (29075) 1950 DA together, found by Rozitis et al.¹ to be 64 pascals, can be generated by a loosely
packed regolith with particles as fine as roughly 10 micrometres’.

This strength is consistent with, but much
more precisely determined than, similar
levels of cohesive strength that have been
deduced for rubble-pile asteroids on the basis
of spin-rate and size statistics of asteroids’ and
on the inferred strength, size and spin rate of
the active asteroid P/2013 R3 (ref. 3). This
asteroid was recently observed to comprise
several chunks that are slowly escaping from
each other, probably owing to rotational dis-
ruption’. A model for how to generate such a
modest level of strength in geophysical bodies
has been hypothesized’, and achieves this
through ‘dry cohesion’ arising from van der
Waals forces between components of a rubble 
pile (Fig. 1). In this theory, the finest grains 
(potentially as small as 1-10 micrometres) 
in a rubble pile that are present in sufficient 
numbers to connect all larger grains pro- 
vide a very weak cement that can hold the 
body together — a fairy-dust version of a 
sandcastle. 

Although this image of fairy-castle aster- 
oids is entertaining, the implications of these 
measurements are far-reaching. A defining 
feature of the rubble pile 1950 DA is that it is 
globally in a microgravity environment — the 
centrifugal forces from its rapid spin rate are 
nearly balanced by its gravitational attraction, 
with the difference between them being a tiny 
fraction of Earth's gravity. In such a regime, 
weak van der Waals forces can dominate’. 
The evident stability of such a strange body 
as 1950 DA exposes our ignorance of how the 
geophysics of asteroids works in the micro- 
gravity regime, with its current state being dif- 
ficult to reconcile with classical views of how 
rubble-pile bodies form from catastrophically 
disrupted parent bodies. Although Rozitis 
et al. lay out a plausible story for the current 
state of 1950 DA, the development of a com- 
plete theory of microgravity geophysics could 
have significant consequences, beyond this 
single case, for our evolving understanding of 
asteroids and the Solar System. 

For asteroids, the larger implications of such 
a weakly cohesive material — for example, 
the dissipation of energy in their interiors®, 
the shedding of material from their surfaces’ 
and the creation of binary asteroid systems 
through the fissioning of rapidly rotating rub- 
ble piles*” — have yet to be fully explored and 
understood. Going beyond asteroids, many 
different bodies and environments in the past 
and present Solar System lie in microgravity 
regimes similar to that of 1950 DA, where iner- 
tial, gravitational and weak molecular forces 
may be simultaneously relevant. The effects of 
the interplay of these forces in, for instance, 
the creation and destruction of transient 
structures in planetary ring systems and the 
accretion of grains in protoplanetary disks all 
become ripe topics for investigation motivated 
by this example. 

Coming back to near-Earth asteroids, this 
result and the underlying theory also have 
ramifications for the exploration of small 
asteroids such as 1950 DA, currently a topic of 
great interest to national space agencies and 
a few private corporations. Small amounts of 
cohesion in an asteroid’s regolith can enable 
its surface to become ‘perched’, just waiting
for a meteorite impact (or passing astronaut) 
to destabilize it — similar to avalanches on 
Earth. The global strength of such rubble-pile 
asteroids held together by these weak forces 
is also unclear. How often might avalanches 
consume the entire body, causing it to split 
and disassemble? Recent observations of
active asteroids seem to indicate that such
natural outcomes might not be that rare*”. 

The ability for human or robotic inter- 
actions to create such global changes to a 
small asteroid suggests an intriguing vision 
of geophysical laboratories in space. Given 
that small, near-Earth asteroids are accessible 
using spacecraft, it becomes possible to do 
controlled geophysical experiments on these 
bodies that result in global and locally meas- 
urable changes. This would allow us to probe 
the geophysics of microgravity aggregates in 
their natural environments, and to do so at 
scales that cannot be recreated on Earth or in 
Earth’s orbit, at the cost of a modest planetary- 
science mission. 

Independent of whether we choose to take 
advantage of such natural laboratories in the 
near future, humanity might eventually have 
no choice, because 1950 DA is due to pay an 
uncomfortably close visit to Earth. The aster- 
oid is one of the most potentially hazardous 
known, with a 1 in 4,000 chance of impacting 
the Earth in the year 2880 (ref. 10). Such an 
impact could have planet-wide consequences 
owing to the asteroid’s size. Among the many 
proposed methods for deflecting this hazard 
is to run a massive spacecraft into it at high 
speed, or to set off a nuclear blast in close 
proximity’. However, for this weakly bound 
body, we should wonder whether such an 
attempt would make it crumble and fall apart 
like a sandcastle that has been baked in the 
sunshine. 

Whether the impact from such a disag- 
gregated asteroid would pose a larger threat 
to Earth has been a matter of debate in the 
scientific community. Whereas a single aster-
oid packs a larger punch, the shotgun spray
from a disaggregated body may hit multiple
sites across the globe. For a rapid rotator such 
as 1950 DA, however, this is not a relevant 
question. Once released from each other, 
the speed of the body’s components relative 
to the asteroid’s centre of mass would range 
from tens of centimetres per second if it split 
in half, to up to 50 cm s⁻¹ for material that
might break off its surface. These speeds are 
much greater than most mitigation tech- 
niques could deliver to the parent asteroid. 
This would cause the components to drift rel- 
ative to the initial impact trajectory by more 
than one Earth radius in less than a year — 
sparing humanity from having to resolve such 
a delicate question.
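
The drift argument is simple arithmetic, sketched below under a strong simplification: each fragment is assumed to move in a straight line at constant speed relative to the original trajectory, with orbital dynamics ignored. The function names and the Earth-radius constant are choices of this example, not values from the article.

```python
EARTH_RADIUS_M = 6.371e6               # mean Earth radius in metres
SECONDS_PER_YEAR = 365.25 * 86400.0


def drift_time_years(speed_m_s: float, distance_m: float = EARTH_RADIUS_M) -> float:
    """Years needed to drift `distance_m` at a constant relative speed."""
    return distance_m / speed_m_s / SECONDS_PER_YEAR


def min_speed_for_drift(years: float, distance_m: float = EARTH_RADIUS_M) -> float:
    """Constant speed (m/s) needed to drift `distance_m` within `years`."""
    return distance_m / (years * SECONDS_PER_YEAR)


# Upper speed quoted in the text: 50 cm/s for material shed from the surface.
print(f"time at 0.5 m/s: {drift_time_years(0.5):.2f} years")
print(f"speed for one Earth radius per year: {min_speed_for_drift(1.0):.2f} m/s")
```

On this crude estimate, a relative speed of about 20 centimetres per second is enough to move a fragment one Earth radius in a year, and at 50 centimetres per second the same drift takes well under half a year.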


Old blood stem
cells feel the stress 


Ageing is accompanied by deterioration in the haematopoietic stem cells that are 
responsible for regenerating the blood system. Cellular stress in the aged stem 
cells could be a cause of this decline. SEE LETTER P.198 


JIRI BARTEK & ZDENEK HODNY 


Tissue renewal is a fundamental process
that relies on the regenerative capac- 
ity of long-lived, self-renewing stem 
cells. But during ageing, stem-cell function 
deteriorates. The haematopoietic stem cells 
(HSCs) that maintain all blood-cell lineages 
are, like other long-lived stem cells, prone to 
accumulating DNA damage as they age. In the 
case of HSCs, the damage can reduce the cells’
ability to regenerate blood-cell lineages, and
can increase the risk of diseases such as leukae- 
mia. But little is known about what causes the 
damage and how it contributes to the decline 
of old HSCs. On page 198 of this issue, Flach 
et al.¹ report that damage is caused mostly by
cellular stress that arises as a result of ineffi- 
cient DNA replication, and they point to the 
probable molecular defects involved. 

Figure 1 | Replication stress in ageing cells. a, In young haematopoietic stem cells (HSCs), the MCM
protein complex promotes DNA replication, whereby new DNA strands are generated. In a separate
process, the messenger RNA (mRNA) produced during gene transcription moves to a structure called
the ribosome to be translated into proteins. These processes enable HSCs to self-renew and give rise to
all blood-cell lineages. b, Flach et al.¹ report that aged HSCs have lower levels of two subunits of MCM,
MCM4 and MCM6, than do young cells, which prevents the MCM complex from working properly,
resulting in DNA replication stress. DNA damage associated with replication stress is not properly
repaired in old HSCs, which leads to abnormalities in genes encoding ribosomal components, impaired
ribosome assembly and reduced protein production. These combined stresses on HSCs lead to abnormal
production of blood-cell lineages.

DNA damage occurs when cells cannot
repair genetic inaccuracies, which frequently
arise while DNA is being replicated during
cell proliferation. The idea that DNA damage 
is a major driver of the deterioration of stem 
cells in general, and old HSCs in particular, is 
supported by the fact that both mice and peo- 
ple with deficiencies in DNA repair age more 
quickly than those without such deficiencies*>. 
But debate over the potential causes of DNA 
damage in HSCs has been lively and multi- 
faceted, because factors both intrinsic to the 
cell itself (for example, loss of cell polarity) and 
extrinsic (such as secreted proteins or changes 
in the types of cell surrounding the HSCs) can 
affect the environment in which old HSCs 
reside®. 

To investigate the origin and impact of DNA 
damage in aged HSCs, Flach and colleagues 
compared HSCs isolated from the bone mar- 
row of young and old mice. Compared with 
young cells, old HSCs showed a functional 
decline, together with signalling indicative of 
DNA damage, which the authors gauged by 
presence of the γH2AX protein. γH2AX was
accompanied by an increased abundance of 
proteins associated with inefficient DNA rep- 
lication (known as DNA replication stress)’. 
These proteins promote signalling by the 
enzyme ATR, which modifies many cellular 
functions’”. 

Following up on this unexpected result, the 
authors found that ATR signalling was acti- 
vated in old HSCs, another indication that they 
were subject to replication stress. The cells also 
showed delayed entry into and progression
through S phase, the period of the cell cycle
in which the genome is replicated. Further- 
more, DNA replication frequently stalled in 
old HSCs, and the number of 53BP1 bodies 
— structures that mark chromosomal breaks 
in the nuclei of cells that have experienced 
replication stress* — rose. 

To look at what molecular defects could be 
responsible for enhanced replication stress 
in aged HSCs, Flach et al. compared gene- 
expression profiles in young and old HSCs. 
Genes encoding the proteins MCM4 and 
MCM6 (two components of an MCM protein 
complex that is essential for proper replication) 
showed lower expression in old than young 
HSCs, as did a variety of other factors. 

The authors found that experimental deple- 
tion of MCM4 and MCM6 in young HSCs 
impaired the cells’ function. Like old HSCs, 
the altered cells had a poor capacity to regen- 
erate the blood system when transplanted 
into mice, suggesting that low levels of MCM4 
and MCM6 are linked with replication stress,
and thereby with functional deterioration. In 
agreement with this, young HSCs were also 
impaired if replication stress was caused by 
chemical compounds. 

Finally, Flach et al. investigated why γH2AX
was present in HSCs that had stopped prolifer- 
ating and therefore could not be experiencing 
replication stress. They found signs of long- 
term damage signals in genes within riboso- 
mal DNA (rDNA), which includes many genes 
that encode components involved in assembly
of the ribosome (the cellular machinery
responsible for producing protein from 
messenger RNA). This makes sense, because 
rDNA is difficult to replicate and is therefore 
prone to replication stress. The authors showed 
that persistent damage was linked to lowered 
expression of rDNA genes. Consequently, the 
cells made fewer ribosomes, and could not 
produce enough protein to sustain cellular 
function — a state known as ribosomal bio- 
genesis stress”. 

Overall, Flach and colleagues’ work shows 
that old HSCs experience both replication 
stress and ribosome biogenesis stress. The 
former probably triggers the latter, and is 
clearly at least partly responsible for impaired 
blood regeneration in advanced age (Fig. 1). 
The results have broad implications for 
medicine, and raise many questions. For 
example, is replication stress involved in the 
deterioration of ageing stem cells in other tis- 
sues? Is the authors’ mechanism relevant to 
human HSCs? 

Because replication stress underlies many 
tumours”, it is possible that stress in HSCs 
contributes to the progressive accrual of gene 
mutations that cause ageing-related cancers of 
the blood. It will be interesting to determine 
how ribosome biogenesis stress influences 
HSC decline, and to investigate whether the 
p53 tumour-suppressor protein — a known 
sensor of both replication and ribosomal 
stress*”'° — is involved. 

Finally, could restoration of MCM4 and 
MCM6 levels avert replication stress or even 
functional decline in old HSCs? If it could, 
understanding how MCM genes are inhibited 
in old age might be a good starting point for 
defining strategies to postpone, prevent or 
even reverse the deterioration of the ageing 
blood-regeneration system.


CONDENSED-MATTER PHYSICS


Glasses made from
pure metals


The experimental realization of amorphous pure metals sets the stage for studies 
of the fundamental processes of glass formation, and suggests that amorphous 
structures are the most ubiquitous forms of condensed matter. SEE LETTER P.177 


JAN SCHROERS 


On page 177 of this issue, Mao and
colleagues¹ report a method that
allows them to achieve a long-standing
goal for materials scientists — the formation 
of glasses from pure metals. This will enable 
much-needed studies of glass formation in 
simple systems, and allows computational 
modelling of the processes involved. 

For thermodynamic reasons, most liquids 
become crystalline when they are cooled below 
their ‘liquidus’ temperature, above which 
substances are completely liquid. Crystalliza- 
tions occur at different timescales and can be 
suppressed by fast cooling of a liquid, causing
it to vitrify into a glass’. Vitrification occurs 
for various materials at widely different criti-
cal cooling rates (R_c), the minimum rate of
cooling required to form a glass.

Glass formation has been reported for 
metallic alloys®. An alloy’s glass-forming 
ability increases with the number of compo- 
nents in the alloy, particularly if it con- 
tains elements with atomic sizes that 
differ by more than 12% and which 
have the thermodynamic impetus 
to mix*. Some alloys that exhibit these 
criteria, known as bulk metallic glasses, 
have remarkably good glass-forming 
abilities, with R_c values of less than
1,000 kelvin per second (compara- 
ble with the cooling needed to make 
amorphous polymers). They also have 
critical casting thicknesses — the larg- 
est thickness over which heat can be 
extracted enough to avoid crystalliza- 
tion — exceeding 1 millimetre. So far, 
hundreds of complex alloys have been 
reported to form bulk metallic glasses. 

Pure metals do not fulfil the above 
criteria because they lack the complex- 
ity needed to ‘confuse’ crystallization’. 
They have therefore been considered to 
be poor glass formers®. Even advanced 
rapid-cooling techniques have been 
too slow to avoid crystallization of liq- 
uid pure metals, except in some specific 
cases’. Mao and co-workers now intro- 
duce a general ultra-rapid heating and 
cooling method that allows liquids of 
pure metals to be vitrified. 


The authors used a nanometre-scale 
heating device that brings together two metal 
tips approximately 100 nm in length. Heat- 
ing was accomplished using a short electrical 
pulse (about 4 nanoseconds in duration), 
which rapidly melted the tips. The heat then 
dissipated rapidly through the melted sample 
towards the device, inducing cooling rates of 
roughly 10¹⁴ kelvin per second at the centre of
the sample. Such high cooling rates were pre- 
dicted by the researchers to occur on the basis 
of molecular-dynamics modelling, and caused 
vitrification of a region of pure metals approxi- 
mately 40 nm by 50 nm in size. 
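
To give a feel for these rates, the sketch below compares how long it takes to shed a given amount of temperature at a constant cooling rate. It takes the ultra-rapid rate as about 10¹⁴ kelvin per second and the bulk-metallic-glass rate as 1,000 kelvin per second, as discussed in the text; the 1,000-kelvin temperature drop is purely an illustrative assumption, and the function name is hypothetical.

```python
def cooling_time_s(temperature_drop_k: float, rate_k_per_s: float) -> float:
    """Seconds needed to cool by `temperature_drop_k` at a constant rate."""
    return temperature_drop_k / rate_k_per_s


DROP_K = 1000.0  # illustrative temperature drop, not a value from the study

# Ultra-rapid quench, taken here as about 1e14 K/s.
print(f"ultra-rapid quench: {cooling_time_s(DROP_K, 1e14):.1e} s")
# Bulk metallic glass at the 1,000 K/s upper bound quoted in the text.
print(f"bulk metallic glass: {cooling_time_s(DROP_K, 1e3):.1e} s")
```

Under these assumptions the ultra-rapid quench takes about ten picoseconds, whereas a good glass-forming alloy can take about a second over the same temperature drop and still avoid crystallization.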

Figure 1 | Ultra-rapid cooling causes a pure metal to form
a glass. The micrograph shows a region of molten tantalum
between two crystalline regions. Mao and colleagues¹ report that,
on ultra-rapid cooling, the crystalline regions grow into the liquid
region (blue arrows) until the growth kinetics can no longer keep
up with the thermal field defined by the cooling rate. The liquid
in front of the interface then ‘freezes’ into a glass. On heating, the
crystal-liquid interface moves out into the crystalline sections of
the sample (red arrows).

Metallic glasses are pursued for commercial
applications because they exhibit attractive
mechanical properties such as high strength,
elasticity and processability*. The advent of
metallic-glass formation, along with meth-
ods that allow the liquid state of metals to be
studied at slow, experimentally accessible time-
scales, has also been exciting for fundamen-
tal science. These developments have enabled
study of the properties of metallic liquids, and
investigation of both the transition to the crys- 
talline state and the glass transition. However, 
the fact that multicomponent alloys have been 
needed for glass formation has complicated the 
study of metallic glasses. 

In multicomponent systems, glass forma- 
tion depends on atomic-size differences and 
attraction between atoms of different ele- 
ments. Glass formation is also affected by 
the fact that crystallization in alloys typically 
requires a change in atomic composition: 
long-range diffusion is needed to establish the 
difference in composition between the liquid 
and the growing crystalline phase. Such dif- 
fusion has a long timescale and slows down 
crystallization, facilitating glass formation. 
But it also obscures the fundamental and 
ubiquitous aspects of vitrification that would 
be observed in simple systems. Mao and col- 
leagues’ breakthrough allows glass formation 
to be studied in its purest form, and their 
findings confirm theoretical and modelling 
predictions that glass formation can occur in 
pure metals. 

The researchers studied metals in which 
atoms adopt a ‘body-centred cubic’ (bcc) 
arrangement in the crystalline solid phase. But 
what would happen for metals that adopt dif- 
ferent crystal structures, such as the common 
face-centred cubic (fcc) arrangement? Glass 
formation is limited only by crystal growth 
in Mao and co-workers’ heating device, and 
crystal growth rates are slower for bcc crystals
than for fcc ones. The Rc values for pure
fcc metals are therefore expected to be even 
higher than those reported by Mao et al. for 
pure bcc metals. 

In its most general form, crystal- 
lization involves nucleation — the 
initial formation of tiny crystals called 
nuclei — and growth. Glass forma- 
tion competes with the combination 
of both processes. However, crystal- 
lization proceeds through growth into 
the undercooled liquid phase of the 
crystal-liquid interface in Mao and
co-workers’ experiments (Fig. 1). Crystal
growth therefore does not depend on
nucleation in their system, which means
that the Rc values reported by the authors
are probably an overestimate for the most
general form of vitrification in pure bcc
metals. To involve nucleation, direct 
contact of the liquid phase to a crystal- 
line boundary has to be avoided. Such 
an experimental realization would allow 
study of the earliest stages of nucleation 
— one of the great mysteries of physics. 

Experimental investigations of glass 
formation have typically been carried 
out on large samples of more than 10° 
atoms, and at long timescales greater 
than 1 microsecond. By contrast, 
molecular-dynamics simulations have 
been limited to small samples of fewer 
than 10° atoms and short timescales (less than
1 nanosecond), because of the restrictions
of available computing power. Our ability to
predict experimental results from such simulations
has therefore been limited because
the properties of metallic glasses are affected
by sample size and cooling rates. Mao and
colleagues’ method now allows us to carry
out experiments at spatial and temporal
scales similar to those in simulations.
This opens the way to exploring glass formation
and its competition with crystallization.


Given that vitrification has previously been 
observed in ionic melts, aqueous solutions, 
alloy melts, molecular liquids and polymers, 
the finding that pure metals can also be glasses 
suggests that amorphous structures are the most 
ubiquitous form of condensed matter.

One cell at a time


Single-cell DNA sequencing of two breast-cancer types has shown extensive 
mutational variation in individual tumours, confirming that generation of 
genetic diversity may be inherent in how tumours evolve. SEE ARTICLE P.155 


EDWARD J. FOX & LAWRENCE A. LOEB 


Next-generation DNA sequencing has
revolutionized the field of cancer
genomics. Although this sequencing
can identify the most frequent mutation
in a population of cells, it struggles to
resolve the mutational diversity and multiple 
genomes of the individual cells that comprise 
a tumour. Achieving DNA sequencing down 
to the resolution of a single cell has been a 
long-held dream for understanding the cel- 
lular heterogeneity that is inherent in many 
complex biological systems and, in particu- 
lar, for delineating the mixture of genomes 
in human cancers. On page 155 of this issue,
Wang et al. report an innovative sequencing
method, termed nuc-seq, that achieves almost 
complete sequencing of whole genomes in 
single cells. 
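
Why bulk sequencing struggles with rare variants can be illustrated with a little arithmetic. The sketch below assumes diploid cells, heterozygous mutations and a hypothetical 1% error floor for bulk sequencing. These figures are illustrative assumptions and not values reported by Wang et al.

```python
def variant_allele_fraction(cell_fraction, mutant_copies=1, total_copies=2):
    """Expected fraction of reads carrying a mutation in a bulk (pooled) sample."""
    return cell_fraction * mutant_copies / total_copies


ERROR_FLOOR = 0.01  # hypothetical bulk-sequencing error rate (assumption)

for cell_fraction in (1.0, 0.1, 0.01):
    vaf = variant_allele_fraction(cell_fraction)
    status = "detectable" if vaf > ERROR_FLOOR else "lost in the noise"
    print(f"{cell_fraction:>5.0%} of cells -> allele fraction {vaf:.3f} ({status})")
```

Under these assumptions, a mutation carried by only 1% of cells contributes about 0.5% of reads and cannot be told apart from sequencing errors, whereas sequencing that cell on its own would show the mutation in roughly half of the reads.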

As a cell prepares to divide, it replicates the 
DNA in its nucleus. By sorting and sequenc- 
ing only the newly ‘doubled’ nuclei, nuc-seq 
takes advantage of this duplication to achieve 
lower rates of sequencing errors than most 
previous techniques. The authors validated
their method using targeted duplex sequencing,
a protocol that sequences both strands of
DNA to identify mutations at exceptionally
high accuracy. They suggest that the use of
nuc-seq to sequence single-cell genomes, with 
validation by targeted deep sequencing, will be 
instrumental in defining the genomic hetero- 
geneity of cancers. 

To demonstrate this, Wang et al. used their 
technique to sequence the genomes of multiple 
single cells from two types of human breast can- 
cer, and found that no two individual tumour 
cells were genetically identical. As well as the 
large numbers of mutations that are common 
to the majority of cells in a tumour, the authors
uncovered an even greater number of subclonal
and de novo mutations (those that are unique 
to individual cells). They also present estimates, 
derived from mathematical models, of muta- 
tion rates of single cells within tumours. On the 
basis of these models, they show that distinct 
types of DNA alteration seem to accumulate at 
different rates in different tumours, and suggest 
that two separate ‘mutational clocks’ operate in 
cancer. Large-scale, structural changes in DNA 
(such as amplification and deletion of large 
blocks of DNA) probably occur early in tumour 
development, in punctuated bursts of evolu- 
tion, whereas point mutations may accumulate 
more gradually, generating extensive subclonal 
diversity. The authors’ findings indicate that 
slower-growing ‘luminal’ breast-cancer cells 
exhibit relatively low mutation rates, whereas 
cells from clinically more aggressive, ‘triple- 
negative’ breast cancers have mutation rates 
that are 13 times greater than in normal cells.

Nuc-seq and comparable single-cell 
sequencing methods will allow a more
detailed understanding of mutational hetero- 
geneity in individual tumours, and will 
influence our understanding of how cancers 
evolve and our approach to their treatment. 
In particular, mutational diversity within a 
tumour is likely to be predictive of whether 
resistance to a particular chemotherapy will 
emerge during treatment, because mutations 
in genes that render cells resistant to specific 
drugs may exist before initiation of therapy. 
This has previously been documented for the 
failure of certain molecularly tailored cancer 
treatments. Such findings also reinforce the
fact that single, bulk sampling of a tumour (a
strategy that is commonly used to select targeted
therapies) is not representative of the
tumour as a whole.

The total number of mutations that a 
tumour genome carries, including those pre- 
sent in only a small subset of cells, may in fact 
underlie the aggressiveness of different cancer 
subtypes. For example, the extent of genetic 
diversity within a tumour, and its divergence 
from normal tissue, probably influences the 
ability of the immune system to distinguish 
malignant cells from normal cells. Identifying 
the mechanisms by which cancer cells generate
mutational heterogeneity may therefore
present previously unexplored therapeutic
targets.

Figure 1 | Levels of diversity. The genetic characteristics of cancers vary between patients, between
primary and metastatic tumours in a single patient, and between the individual cells of a tumour. Wang
et al. present a single-cell, whole-genome sequencing technique that will allow a better understanding of
genetic heterogeneity within individual tumours.

An array of techniques to analyse individual
cells has now been developed. It remains to be
seen, however, just how robust nuc-seq and
other single-cell genomics techniques, such
as MALBAC, will prove to be. For example,
many cancer cells are aneuploid (they carry 
abnormal numbers of chromosomes), and 
the application of nuc-seq may be restricted 
to cancers that do not exhibit aneuploidy. 
Also, although the cost of genome sequencing 
continues to decline (albeit more slowly now 
than in the past), the cost of single-cell genom- 
ics and the complexities of the bioinformatic 
analyses involved are still formidable. 

In our quest to decipher cancer genomes, 
the advent of single-cell sequencing marks a 
technical milestone. It crystallizes the concept
that the genome of each tumour is dynamic
and highly diverse, whether we are comparing
cancer genomes between tumours of
different patients, between anatomically distinct
regions of a tumour within a patient or
even between individual cells within the same
tumour (Fig. 1). Single-cell sequencing will
allow us to detect rare mutant subpopulations
hidden within cancers that could expand
and lead to drug resistance, and thus to avoid
unnecessary and potentially harmful administration
of ineffective, toxic therapies. Ultimately,
the exceptional plasticity of the tumour
genome may well prove to be a key characteristic
of cancer and a major, as yet untapped,
therapeutic vulnerability.

ASTRONOMICAL INSTRUMENTATION

Atmospheric blurring
has a new enemy


A fully automated optics system that corrects atmospheric blurring of celestial 
objects has imaged 715 star systems thought to harbour planets, completing each 
observation in less time than it takes to read this article. 


BRENT ELLERBROEK 


At the time of writing, observations from
the Kepler Space Mission have yielded
more than 975 confirmed exoplanet
detections from 4,234 candidates. These candidates
are identified by small, periodic drops
in the brightness of the star, indicating that a
planet might be transiting in front of it. This
is perhaps the conceptually simplest method 
of finding exoplanets, and it remains the 
only approach that can find planets with the 
proper orbit and radius to potentially support 
life. However, follow-up observations of high 
spatial resolution are needed to confirm and 
characterize each candidate system detected by 
the Kepler mission. Writing in The Astrophysical
Journal, Law et al. describe how they have
used a robotic adaptive optics system to follow
up 715 of the Kepler candidate star systems
in just 36 hours of observing time. 

In planetary-transit observations, the size of 
the planet can be inferred from the relative dip 
in star brightness measured during the transit. 
Only a small fraction of exoplanets will transit 
their star when viewed from Earth, but a statistical
analysis of the Kepler candidates detected
in observations of more than 100,000 stars indicates
that exoplanets may be relatively common.
This includes Earth-sized planets with orbits
that would permit liquid water to exist on the
planets’ surfaces. Because Kepler was designed
to continually monitor many thousands of stars, 
its images lack the spatial resolution needed to 
characterize individual star systems in further 
detail. This means that various false-positive 
detections — for example, those associated with 
the partial eclipse of a star in a binary system 
by its companion star — cannot be ruled out, 
and that possible binary host stars (or stars in 
the foreground or background by coincidence) 
cannot necessarily be identified. 

The presence or absence of a stellar com- 
panion to a ‘primary host star’ is important 
information for understanding the formation 
and development of planetary systems. It also 
affects the estimation of the planet’s size from 
the transit: the relationship between star bright- 
ness and planetary radius is more complex if 
there is a stellar companion, leading to incor- 
rect results if the existence of the companion 
star is unknown. For these and other reasons, 
follow-up observations with high angular 
resolution are needed to fully understand each 
Kepler candidate. 
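
The effect of an unseen companion on the inferred planet size can be sketched with the standard transit-depth relation, in which the fractional dip equals the square of the planet-to-star radius ratio, corrected for the extra light that dilutes the dip. The function name and every numerical input below are hypothetical and chosen only for illustration. They are not values from Law et al.

```python
import math


def planet_radius(depth, host_radius, other_to_host_flux=0.0):
    """Planet radius (same units as host_radius) from an observed transit depth.

    depth: fractional dip in the combined light of all unresolved stars.
    other_to_host_flux: brightness of the non-host star divided by the host's.
    """
    undiluted_depth = depth * (1.0 + other_to_host_flux)
    return host_radius * math.sqrt(undiluted_depth)


R_SUN_IN_EARTH_RADII = 109.1   # approximate conversion

depth = 1.0e-4                 # hypothetical observed dip (100 parts per million)
companion_flux = 0.1           # hypothetical companion, 10% as bright as the primary
companion_radius = 0.6         # hypothetical companion radius, in solar radii

single = planet_radius(depth, 1.0 * R_SUN_IN_EARTH_RADII)
around_primary = planet_radius(depth, 1.0 * R_SUN_IN_EARTH_RADII, companion_flux)
around_companion = planet_radius(
    depth, companion_radius * R_SUN_IN_EARTH_RADII, 1.0 / companion_flux
)
print(f"{single:.2f} {around_primary:.2f} {around_companion:.2f} Earth radii")
```

In this made-up case the planet looks close to Earth-sized if the star is assumed to be single, grows slightly once the companion's light is accounted for, and is roughly twice as large if it actually orbits the fainter star.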

High-resolution follow-up images could be 
collected by a space-based observatory such 
as the Hubble Space Telescope, but observ- 
ing thousands of candidate systems would 
monopolize this limited resource. Obtaining 
such images from the ground is made diffi- 
cult by the blurring (‘seeing’) introduced by 
atmospheric turbulence, and by the resulting
inhomogeneity in the density and refractive
index of the air. In the past two decades,
ground-based observatories have begun
using a technology known as adaptive optics to
measure and correct this blurring in real time.
Many of these systems now use lasers to create 
artificial ‘guide stars’ on the sky to measure the 
blurring, and then correct it for science targets 
that are themselves too faint to be used for such 
measurement — as is the case for many of the 
Kepler candidates. Adaptive optics surveys 
of the candidate systems began in 2011-12 
(refs 8,9), but these initial studies were limited 
to fewer than 100 targets because of the time 
taken to set up and initiate each observation, 
typically at least 15-20 minutes per target for 
most current adaptive optics systems. 
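
The cost of that set-up time is easy to quantify for a survey of the size described in this article. The short calculation below uses only figures quoted in the text (715 targets, 36 hours, 15-20 minutes of overhead per target) and the variable names are illustrative.

```python
TARGETS = 715
ROBO_AO_HOURS = 36
CONVENTIONAL_MINUTES_PER_TARGET = (15, 20)  # set-up overhead alone, per the text

# Average time per target for the automated survey.
robo_minutes_per_target = ROBO_AO_HOURS * 60 / TARGETS

# Total set-up overhead if every target needed 15 or 20 minutes.
conventional_hours = tuple(
    minutes * TARGETS / 60 for minutes in CONVENTIONAL_MINUTES_PER_TARGET
)

print(f"automated survey: {robo_minutes_per_target:.1f} minutes per target")
print(f"set-up overhead alone: {conventional_hours[0]:.0f} to {conventional_hours[1]:.0f} hours")
```

The automated survey averages about three minutes per target, whereas set-up overhead of 15-20 minutes per target would by itself consume roughly 180 to 240 hours for the same sample.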

The robotic adaptive optics system 
(Robo-AO) used by Law and colleagues super- 
sedes these constraints. The system has been 
designed for highly efficient, automated
high-resolution observing on 1- to 3-metre-
class telescopes, and has been mounted on
the 60-inch (1.5-metre) telescope at the Palomar
Observatory in California (Fig. 1). The
atmospheric blurring at Palomar Observatory
is typically about 0.65 arcseconds. Robo-AO
sharpens star images to about 0.12-0.15 arcseconds
in diameter, not far from the
0.09-arcsecond value that is theoretically possible
with a 1.5-metre telescope in space. This
performance has enabled Law et al. to resolve 
53 of the 715 Kepler candidates observed by 
Robo-AO so far into multiple stars. Forty- 
three of these 53 are new discoveries, includ- 
ing one that is a probable false positive for a 
candidate exoplanet. 
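
The theoretical value quoted for a 1.5-metre telescope can be reproduced from the diffraction scale, wavelength divided by aperture. The sketch below assumes an observing wavelength of 650 nm (red visible light), which is not stated in the article. The Rayleigh criterion, 1.22 times this scale, would give a slightly larger angle.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi  # about 206,265


def diffraction_limit_arcsec(wavelength_m, aperture_m):
    """Angular scale lambda / D of a diffraction-limited telescope, in arcseconds."""
    return wavelength_m / aperture_m * ARCSEC_PER_RADIAN


# The wavelength is an assumption; the aperture is the one given in the text.
limit = diffraction_limit_arcsec(wavelength_m=650e-9, aperture_m=1.5)
print(f"diffraction scale: {limit:.3f} arcseconds")
```

Under that assumption the result is roughly 0.09 arcseconds, consistent with the figure given above.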

Figure 1 | Creating artificial guide stars. The
robotic adaptive optics system (Robo-AO),
which has been used by Law and colleagues to
observe star systems that are thought to host
planets, projects a laser beam above the Palomar
60-inch telescope to generate an artificial guide
star. This is then used to sense and correct
atmospheric blurring. The ultraviolet beam is
not visible to the human eye, but it can be seen
in digital cameras after their internal filters are
removed.

Of course, automated observing at a
rate of 200-250 targets per night, as Law
and co-workers have done, creates a substantial
data cleaning and analysis task. To
detect and characterize companion stars
that are significantly fainter than their primaries,
the authors have developed a fully
automated data-processing pipeline. More
than 800 short (115-millisecond) exposures are 
collected of each target, and these must first 
be calibrated, centred and averaged into a 
single image. The light from the primary 
star is then subtracted from the image, and 
an automated target-detection algorithm is 
applied. Companion stars that are as faint as 
one one-hundredth to four one-thousandths 
of the primary can be detected in median to 
good atmospheric conditions; this is not faint 
enough to find exoplanets, but more than 
sufficient to identify many companion stars. 
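
Astronomers usually express such contrasts as magnitude differences. As a small sketch (the function name is illustrative), the standard relation between a flux ratio and a magnitude difference converts the limits quoted above:

```python
import math


def contrast_to_delta_mag(flux_ratio):
    """Magnitude difference for a given companion-to-primary flux ratio."""
    return -2.5 * math.log10(flux_ratio)


for ratio in (1 / 100, 4 / 1000):
    delta_mag = contrast_to_delta_mag(ratio)
    print(f"flux ratio {ratio:g} -> {delta_mag:.1f} magnitudes fainter")
```

The quoted range therefore corresponds to companions about five to six magnitudes fainter than the primary star.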
Once the brightness of a companion star has 
been measured, its mass and diameter can be 
determined from standard stellar models and 
from the characteristics of the primary star. 
Law et al. have provided updated estimates 
for the planetary radii of each of the Kepler 
candidates with a fainter companion star. Five 
small planet candidates have been confirmed 
to be less than twice the diameter of Earth, but 
a larger number of other candidates could be 
significantly bigger than this threshold if they 
are found to be orbiting the fainter companion 
star instead of the primary (this would require 
future observations of their transits with Robo-AO
or some other high-resolution system). The
team also suggests that several of the stars with 
multiple Kepler candidate planets are likely to 
be coincident multiples — two separate plan- 
etary systems orbiting both stars of the binary 
pair. In addition, the Robo-AO observations 
so far yield plausible (98% confidence) evidence
that giant planets with orbital periods
of less than 15 days are two to three times more 
likely than longer-period planets to be found
in binary-star systems. This suggests that
companion stars have a role in creating close-in
giant planets and stabilizing their orbits. The
researchers expect to observe every Kepler candidate
using Robo-AO by the end of 2014 to
confirm this conclusion, and aim to develop a
more comprehensive statistical sample of planetary
systems associated with binary stars.

More generally, these results are a convincing
indication that laser-guide-star adaptive
optics is now ready for highly efficient, quantitatively
precise, high-resolution astronomical
observations. Current and future adaptive
optics systems on much larger telescopes than
the Palomar 60-inch telescope (such as Keck,
Gemini and the European Southern Observatory’s
Very Large Telescope) can produce
even sharper images of even fainter objects,
but much work will be needed to match the
degree of automation and efficiency already
demonstrated by Robo-AO.

Corralling a protein-degradation regulator

The crystal structure of the COP9 signalosome, a large protein complex that
regulates intracellular protein degradation, reveals how the complex achieves
exquisite specificity for its substrates. SEE ARTICLE P.161

Some 20 years ago, an enzyme complex
was linked to the dramatic changes in
development that occur when seedlings
push through the soil and encounter sunlight.
This complex, named the COP9 signalosome
(CSN), is now thought to be common to all
animals, plants and fungi. The CSN is involved
in protein degradation, but because of its complicated
structure, detailed knowledge of how
its activity is controlled has remained elusive.
On page 161 of this issue, Lingaraju et al.
report the crystallization of the CSN and
determine its structure to a resolution of a
remarkable 3.8 angstroms.

The CSN consists of eight protein subunits,
CSN1-8, and regulates a family of
enzyme complexes called cullin-RING E3
ubiquitin ligases (CRLs), which modify their
target proteins by attaching ubiquitin proteins
to them. Ubiquitin modifications can have
many effects on proteins, from influencing
their cellular location to causing their degradation.
In fact, the cullin protein that makes
up the backbone of each CRL must itself be
modified by a ubiquitin-like protein, NEDD8,
before it can function as a ubiquitin ligase. The 
CSN inhibits this activity by detaching NEDD8 
from cullin, and can also bind ‘deneddylated’ 
CRLs, thereby maintaining CRL inactivity 
after NEDD8 removal.

The CSN structure described by Linga- 
raju and colleagues brings to mind a widely 
splayed hand on which a small box sits askew, 
topped by a tomato (Fig. 1). Like a hand, the 
CSN has five digits (the amino-terminal ends 
of CSN1, 2, 4, 7, and 3 plus 8) projecting from 
an organizing centre, the palm. The palm is 
formed by the ‘winged-helix’ subdomains 
of these subunits, which associate to form 
a horseshoe-shaped structure. Resting on 
the hand is the box, formed by bundling of 
the carboxy-terminal ends of each subunit. 
Sitting atop this platform is the CSN5-CSN6 
tomato. 

Whereas some aspects of the CSN structure 
were anticipated from previous work on 
related proteins, it is a big surprise that 
the structure obtained by Lingaraju and 
co-workers is in an inactive configura- 
tion. The active site of the CSN is speci- 
fied by a ‘JAMM’ domain in the CSN5 
subunit. Typically, the active sites
of the enzymes in the JAMM family
contain a zinc ion (Zn²⁺) bound
by three evolutionarily conserved
amino-acid residues (two histidines
and an aspartate), with the remaining
ligand-binding site of Zn²⁺ occupied
by a water molecule that has been
activated by another evolutionarily
conserved amino acid, glutamate 76
(Glu 76; ref. 8). This activated water
molecule detaches ubiquitin or ubiq- 
uitin-like proteins from their targets 
by hydrolysis. Whereas the histi- 
dine and aspartate residues of CSN5 
are positioned as expected in the 
CSN structure, the water molecule 
is replaced by another amino acid, 
Glu 104. This explains a long-standing 
puzzle: whereas other JAMM-contain- 
ing proteins efficiently cleave model 
substrates, such as ubiquitin with a 
rhodamine dye attached to its C termi- 
nus, purified CSN does so only poorly. 

Lingaraju et al. tested the role of 
Glu 104 in CSN regulation by per- 
forming enzyme assays on CSN com- 
plexes in which Glu 104 was mutated. 
This mutant cleaved ubiquitin- 
rhodamine much faster than the 
natural enzyme, indicating that 
Glu 104-Zn²⁺ binding might keep
the CSN in an inactive state when it 
is free from CRL. Notably, mutation of the 
adjacent residue, threonine 103, results in 
defective development of the nervous system 
in fruit flies, which suggests that Glu 104-mediated
regulation is required for proper
control of CSN activity in vivo. 

The inhibited state of unbound CSN raises 
the obvious question of how the CSN gains its 
activity on binding CRLs. The authors used 
computer-modelling studies to compare their 
crystal structure of free CSN with a structure 
determined by electron microscopy in which
the CSN was bound to a CRL enzyme to which 
NEDD8 is attached. This comparison showed 
clearly that, to reconcile the two structures, 
substantial rearrangements of CSN2, CSN4 
and CSN5-CSN6 must occur when the CSN 
and CRL bind (Fig. 1). In particular, move- 
ments in CSN4 and CSN6 must lead to a 
change in the CSN4-CSN6 interface.

To probe the significance of this interface, 
Lingaraju and colleagues deleted a β-hairpin
loop in CSN6 that contributes to its inter- 
action with CSN4. Surprisingly, the resulting 
complex, like the Glu 104 mutant, efficiently 
cleaved ubiquitin-rhodamine. It also dened- 
dylated CRL more than four times faster than 
did the wild-type enzyme. These observations 
make it tempting to speculate that CSN4 is 
the signalosome’s CRL sensor, and that CSN4 
movement during CRL binding triggers a cas- 
cade of rearrangements transmitted through 
CSN6 that prise CSN5’s Glu 104 residue away
from Zn²⁺, so that Glu 76 can move into
position, activating the CSN. However, when
the authors made a double mutant lacking
both the CSN6 loop and Glu 104, they found
it to be more active than either individual
mutant, suggesting that these two mutations
have independent effects, rather than acting in
a linear cascade. Furthermore, the N-terminal
region of CSN4 does not seem to make strong
contact with CRL, indicating that the CSN’s
CRL sensor may be in another subunit.

Figure 1 | Structure of the COP9 signalosome (CSN). This
enzyme complex is composed of eight CSN protein subunits. Six
subunits make up the base of the CSN, a splayed ‘hand’ in which the
proteins’ N-terminal ends are at the fingertips and their winged-helix
domains, drawn as circles, assemble to form the palm (partially
obscured). The C-terminal ends of each protein are bundled
together into a ‘box’ that sits askew on the hand. The CSN5 and
CSN6 subunits associate intimately to form a ‘tomato’ sitting on the
box. Lingaraju et al. report that the CSN is inactive until it binds to
its target, a cullin-RING E3 ubiquitin-ligase enzyme complex. On
binding, the CSN undergoes activating conformational changes,
indicated by coloured arrows that represent the movements of
the altered subunits. For simplicity, the box is drawn as a uniform
bundle, and so does not represent the actual position and length of
each C terminus. (Figure adapted from Fig. 1 of ref. 2.)

This study highlights a crucial lesson on the 
use of evolutionary conservation to predict 
enzyme regulation. Comparing the crystal 
structure of the CSN with those of the JAMM- 
containing enzymes AMSH-LP (ref. 10) and 
Rpn11 (refs 11, 12) reveals that, although all 
three use the same amino acids to coordinate 
Zn²⁺ and the activated water molecule, their
activities are controlled in markedly different
ways. AMSH-LP seems to be constitutively
active, Rpn11 activity is promoted by
rearrangements that bring the enzyme and its
target substrate into proximity, and CSN5
is activated by substrate-driven relief of inhibition.
Strikingly, CSN5 is inhibited by distinct
mechanisms depending on whether the subunit
is on its own or integrated into the CSN.
Although some generalizations apply across 
the JAMM family, it is clear that each member 
has its own distinctive features. 

What lies ahead for research on the CSN? It 
will be fascinating to examine the structure of 
different CSN mutants, to work out the 
mechanism by which binding to CRLs 
brings about major conformational 
changes. It would also be wonderful to 
see a CSN-NEDD8-CRL complex in its 
full glory, to gain an atomic-level view 
of the CSN-CRL interface and how 
it might be influenced by NEDD8 or 
substrates that bind to CRL. Another 
question is whether binding of the CSN 
to neddylated or deneddylated CRL 
promotes the same conformational 
change in the CSN. 

Binding and kinetic studies of the 
CSN and the mutated complexes 
reported by Lingaraju et al. should 
reveal whether the CSN’s catalytic rate 
is determined by the conformational 
rearrangement that occurs on CRL 
binding. Furthermore, in vivo studies 
with Glu 104 and CSN6-loop mutants 
should show why free CSN must be 
inhibited. 

Finally, this structure may help the 
design of drugs that act on the CSN, 
which could be an attractive target 
for the treatment of breast and liver 
cancer. Although detailed characterization 
of CSN inhibitors has 
not been reported, my laboratory has 
identified several candidates through 
high-throughput screening (PubChem 
AID652009). The surprising observa- 
tions reported by Lingaraju et al. suggest that 
it may be possible to inhibit the CSN but spare 
other JAMM proteins, by interfering with the 
active-site rearrangement that occurs when the 
CSN and CRL bind. 
Limits on fundamental limits to computation 


Igor L. Markov 


An indispensable part of our personal and working lives, computing has also become essential to industries and govern- 
ments. Steady improvements in computer hardware have been supported by periodic doubling of transistor densities in 
integrated circuits over the past fifty years. Such Moore scaling now requires ever-increasing efforts, stimulating research 
in alternative hardware and stirring controversy. To help evaluate emerging technologies and increase our understanding 
of integrated-circuit scaling, here I review fundamental limits to computation in the areas of manufacturing, energy, 
physical space, design and verification effort, and algorithms. To outline what is achievable in principle and in practice, I 
recapitulate how some limits were circumvented, and compare loose and tight limits. Engineering difficulties encountered 
by emerging technologies may indicate yet unknown limits. 


Emerging technologies for computing promise to outperform conventional 
integrated circuits in computation bandwidth or speed, 
power consumption, manufacturing cost, or form factor. However, 
razor-sharp focus on any one nascent technology and its benefits sometimes 
neglects serious limitations or discounts ongoing improvements in 
established approaches. To foster a richer context for evaluating emerging 
technologies, here I review limiting factors and the salient trends in 
computing that determine what is achievable in principle and in practice. 
Several fundamental limits remain substantially loose, possibly indicating 
viable opportunities for emerging technologies. To clarify this uncertainty, 
I examine the limits on fundamental limits. 


Universal and general-purpose computers 

If we view clocks and watches as early computers, it is easy to see the impor- 
tance of long-running calculations that can be repeated with high accu- 
racy by mass-produced devices. The significance of programmable digital 
computers became clear at least 200 years ago, as illustrated by Jacquard 
looms in textile manufacturing. However, the existence of universal com- 
puters that can efficiently simulate (almost) all other computing devices— 
analogue or digital—was only articulated in the 1930s by Church and Turing 
(Turing excluded quantum physics when considering universality)’. Effi- 
ciency was studied from a theoretical perspective at first, but strong demand 
in military applications in the 1940s led Turing and von Neumann to develop 
detailed hardware architectures for universal computers—Turing’s design 
(Pilot ACE) was more efficient, but von Neumann’s was easier to program. 
The stored-program architecture made universal computers practical in 
the sense that a single computer design could be effective in many diverse 
applications if supplied with appropriate software. Such practical univer- 
sality thrives (1) in economies of scale in computer hardware and (2) among 
extensive software stacks. Not surprisingly, the most sophisticated and com- 
mercially successful computer designs and components, such as Intel and 
IBM central processing units (CPUs), were based on the von Neumann par- 
adigm. The numerous uses and large markets of general-purpose chips, 
as well as the exact reproducibility of their results, justify the enormous 
capital investment in the design, verification and manufacturing of leading- 
edge integrated circuits. Today general-purpose CPUs power cloud server- 
farms and displace specialized (but still universal) mainframe processors 
in many supercomputers. Emerging universal computers based on field-programmable 
gate-arrays and general-purpose graphics processing units 
outperform CPUs in some cases, but their efficiencies remain complementary 
to those of CPUs. The success of deterministic general-purpose computing 
is manifest in the convergence of diverse functionalities in portable, 
inexpensive smartphones. After steady improvement, general-purpose com- 
puting displaced entire industries (newspapers, photography, and so on) 
and launched new applications (video conferencing, GPS navigation, online 
shopping, networked entertainment, and so on)*. Application-specific inte- 
grated circuits streamline input-output and networking, or optimize func- 
tionalities previously performed by general-purpose hardware. They speed 
up biomolecular simulation 100-fold** and improve the efficiency of video 
decoding 500-fold’, but they require design efforts with a keen understand- 
ing of specific computations, impose high costs and financial risks, need mar- 
kets where general-purpose computers lag behind, and often cannot adapt 
to new algorithms. Recent techniques for customizable domain-specific 
computing offer better tradeoffs, while many applications favour the combination 
of general-purpose hardware and domain-specific software, including 
specialized programming languages such as Erlang, which was used 
to implement the popular WhatsApp instant messenger. 


Limits as aids to evaluating emerging technologies 

Without sufficient history, we cannot extrapolate scaling laws for emerg- 
ing technologies, yet expectations run high. For example, new proposals 
for analogue processors appear frequently (as illustrated by adiabatic quan- 
tum computers), but fail to address concerns about analogue computing, 
such as its limitations on scale, reliability, and long-running error-free com- 
putation. General-purpose computers meet these requirements with digital 
integrated circuits and now command the electronics market. In compar- 
ison, quantum computers—both digital and analogue—hold promise only 
in niche applications and do not offer faster general-purpose computing 
because they are no faster for sorting and other specific tasks''""’. In exagger- 
ating the engineering impact of quantum computers, the popular press has 
missed this important point. But in scientific research, attempts to build 
quantum computers may help in simulating quantum-chemical phenomena 
and reveal new fundamental limits. The sections ‘Asymptotic space-time 
limits’ and ‘Conclusions’ below discuss the limits on emerging technologies. 


Technology extrapolation versus fundamental limits 
The scaling of commercial computing hardware regularly runs into formi- 
dable obstacles”, but near-term technological advances often circumvent 
Table 1 | Some of the known limits to computation 

Columns: Engineering; Design and validation; Energy, time; Space, time; Information, complexity. 

Fundamental limits 
  Engineering: Abbe (diffraction); Amdahl; Gustafson 
  Design and validation: Error-correction and dense codes; fault-tolerance thresholds 
  Energy, time: Einstein (E = mc²); Heisenberg (ΔEΔt); Landauer (kT ln 2); Bremermann; adiabatic theorems 
  Space, time: Speed of light; Planck scale; Bekenstein; Fisher (T(n)^{1/(d+1)}) 
  Information, complexity: Shannon channel capacity; Holevo bound; NC, NP, #P; decidability 

Material limits 
  Engineering: Dielectric constant; carrier mobility; surface morphology; fabrication-related 
  Design and validation: Analytical and numerical modelling 
  Energy, time: Conductivity; permittivity; bandgap; heat flow 
  Space, time: Propagation speed; atomic spacing; no gravitational collapse 
  Information, complexity: Information transfer between carriers 

Device limits 
  Engineering: Gate dielectric; channel charge control; leakage; latency; cross-talk; ageing 
  Design and validation: Compact modelling; parameter selection 
  Energy, time: CMOS; quantum; charge-centric; signal-to-noise ratio; energy conversion 
  Space, time; Information, complexity: Interfaces and contacts; entropy density; entropy flow; size and delay variation; universality 

Circuit limits 
  Engineering: Delay; inductance; thermal-related; yield; reliability; input-output 
  Design and validation: Interconnect; test; validation 
  Energy, time; Space, time: Dark, darker, dim and grey silicon; interconnect; cooling efficiency; power density; power supply; two or three dimensions 
  Information, complexity: Circuit complexity bounds 

System and software limits 
  Engineering; Design and validation: Specification; implementation; validation; cost 
  Energy, time; Space, time: Synchronization; physical integration; parallelism; ab initio limits (Lloyd) 
  Information, complexity: The ‘consistency, availability, partitioning tolerance’ (CAP) theorem 

Summary of material from refs 5, 13-15, 17, 18, 22, 23, 26, 31, 39, 41, 42, 46, 48-50, 53, 54, 57-60, 62, 63, 65, 74-76, 78, 87, 96, 98 and 99. 


them. The ITRS"* keeps track of such obstacles and possible solutions with 
a focus on frequently revised consensus estimates. For example, consensus 
estimates initially predicted 10-GHz CPUs for the 45-nm technology node"*, 
versus the 3-4-GHz range seen in practice. In 2004, the unrelated Quan- 
tum Information Science and Technology Roadmap” forecast 50 ‘digital’ 
physical qubits by 2012. Such optimism arose by assuming technological 
solutions long before they were developed and validated, and by overlook- 
ing important limits. The authors of refs 17 and 18 classify the limits to 
devices and interconnects as fundamental, material, device, circuit, and 
system limits. These categories define the rows of Table 1, and the columns 
reflect the sections of this Review in which I examine the impact of specific 
limits on feasible computing technologies, looking for ‘tight’ limits, which 
obstruct the long-term improvement of key parameters. 


Engineering obstacles 


Engineering obstacles limit specific technologies and choices. For example, 
a key bottleneck today is integrated circuit manufacture, which packs bil- 
lions of transistors and wires in several square centimetres of silicon, with 
astronomically low defect rates. Layers of material are deposited on silicon 
and patterned with lasers, fabricating all circuit components simultaneously. 
Precision optics and photochemical processes ensure accuracy. 


Limits on manufacturing 

No account of limits to computing is complete without the Abbe diffraction 
limit: light with wavelength λ, traversing a medium with refractive 
index n, and converging to a spot with angle θ (perhaps focused by a lens) 
creates a spot with diameter d = λ/NA, where NA = n sin θ is the numerical 
aperture. NA reaches 1.4 for modern optics, so it would seem that 
semiconductor manufacturing is limited to feature sizes of λ/2.8. Hence, 
argon-fluoride lasers with a wavelength of 193 nm should not support pho- 
tolithographic manufacturing of transistors with 65-nm features. Yet these 
lasers can support subwavelength lithography even for the 45-nm to 14-nm 
technology nodes if asymmetric illumination and computational litho- 
graphy are used". In these techniques, one starts with optical masks that 
look like the intended image, but when the image gets blurry, the masks 
are altered by gently shifting the edges to improve the image, possibly 
eventually giving up the semblance between the original mask and the 
final image. Clearly, some limits are formulated to be broken! Ten years 
ago, researchers demonstrated the patterning of nanomaterials by live 
viruses”. Known virions exceed 20 nm in diameter, whereas subwavelength 
lithography using a 193-nm ArF laser was recently extended to 14-nm semiconductor 
manufacturing. Hence, viruses and microorganisms are no 
longer at the forefront of semiconductor manufacturing. Extreme ultra- 
violet (X-ray) lasers have been energy-limited, but are improving. Their 
use requires changing the optics from refractive to reflective. Additional 
progress in multiple patterning and directed self-assembly promises to 
support photolithography beyond the 10-nm technology node. 
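
The arithmetic behind the apparent λ/2.8 limit can be checked in a few lines. The sketch below is illustrative only: the function and constant names are not from this Review, and it simply evaluates the λ/(2·NA) feature-size estimate quoted above for a 193-nm source.

```python
def min_feature_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Nominal feature-size estimate lambda / (2 * NA).

    For NA = 1.4 this is the lambda/2.8 figure quoted in the text.
    """
    return wavelength_nm / (2.0 * numerical_aperture)


ARF_WAVELENGTH_NM = 193.0  # argon-fluoride laser wavelength, from the text
NA_MODERN = 1.4            # numerical aperture of modern optics, from the text

limit_nm = min_feature_nm(ARF_WAVELENGTH_NM, NA_MODERN)
print(f"Nominal limit for a 193-nm source: {limit_nm:.1f} nm")
```

Under these assumptions the nominal limit comes out near 69 nm, which is why 65-nm features would seem out of reach for a 193-nm source without the subwavelength techniques described above.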


Limits on individual interconnects 

Despite the doubling of transistor density with Moore’s law’', semicon- 
ductor integrated circuits would not work without fast and dense inter- 
connects. Copper wires can be either fast or dense, but not both at the same 
time—a smaller cross-section increases electrical resistance, while greater 
height or width increase parasitic capacitance with neighbouring wires 
(wire delay grows with the product of resistance and capacitance, RC). As 
pointed out in 1995 by an Intel researcher, on-chip interconnect scaling 
has become the real limiter of high-performance integrated circuits”. The 
scaling of interconnect is also moderated by electron scattering against 
rough edges of metallic wires'*, which is inevitable with atomic-scale wires. 
Hence, integrated circuit interconnect stacks have evolved from four 
equal-pitch layers in 2000 to 16 layers, with some wires up to 32 times 
thicker than others, including a large amount of dense (thin) 
wiring and fast (thick) wires used for global on-chip communication (Fig. 3). 
Aluminium and copper remain unrivalled for conventional interconnects 
and can be combined in short wires”®; carbon-nanotube and spintronic in- 
terconnects are also evaluated in ref. 98. Photonic waveguides and radio 
frequency links offer alternative integrated circuit interconnect”*”’, but 
still obey fundamental limits derived from Maxwell’s equations, such as 
the maximum propagation speed of electromagnetic waves'*. The num- 
ber of input-output links can only grow with the perimeter or surface area 
of a chip, whereas chip capacity grows with area or volume, respectively. 
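
The fast-versus-dense tradeoff can be illustrated with a toy RC model. This is a rough sketch under stated assumptions, not a model taken from this Review: resistance uses bulk copper resistivity (so the edge scattering mentioned above is ignored), capacitance is a parallel-plate estimate of sidewall coupling to one neighbouring wire, and the relative permittivity of 3.0 and all wire dimensions are hypothetical.

```python
RHO_CU_OHM_M = 1.68e-8   # bulk copper resistivity (ignores edge scattering)
EPS0_F_PER_M = 8.854e-12  # vacuum permittivity


def wire_rc_delay_s(length_m, width_m, height_m, spacing_m, eps_r=3.0):
    """Toy RC product for one wire next to a single neighbour.

    R = rho * L / (w * h); C = eps_r * eps0 * L * h / s (sidewall coupling).
    eps_r = 3.0 is an assumed dielectric, not a value from the text.
    """
    resistance = RHO_CU_OHM_M * length_m / (width_m * height_m)
    capacitance = eps_r * EPS0_F_PER_M * length_m * height_m / spacing_m
    return resistance * capacitance


# Hypothetical 1-mm wires: a thick 'fast' wire and a thin 'dense' wire
# with every cross-sectional dimension (and the spacing) halved.
thick = wire_rc_delay_s(1e-3, 200e-9, 400e-9, 200e-9)
thin = wire_rc_delay_s(1e-3, 100e-9, 200e-9, 100e-9)
ratio = thin / thick
print(f"RC delay ratio thin/thick: {ratio:.1f}")
```

In this simplified model, halving the width, height and spacing of a wire of fixed length quadruples its RC delay, which is one way to see why thick wires are reserved for global on-chip communication.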


Limits on conventional transistors 

Transistors are limited by their tiniest feature—the width of the gate 
dielectric—which recently reached the size of several atoms (Fig. 1), creat- 
ing problems: (1) a few missing atoms can alter transistor performance, 
(2) manufacturing variation makes all the transistors slightly different 
(Fig. 2), (3) electric current tends to leak through thin narrow dielectrics”. 
Therefore, transistors are redesigned with wider dielectric layers”* that sur- 
round a fin shape (Fig. 4). Such configurations improve the control of the 
electric field, reduce current densities and leakage, and diminish process 
variations. Each field effect transistor (FET) can use several fins, extend- 
ing transistor scaling by several generations. Semiconductor manufacturers 
adopted such FinFETs for upcoming technology nodes. Going a step fur- 
ther, in tunnelling transistors’’, a gate wraps around the channel to con- 
trol the tunnelling rate. 


Limits on design effort 
In the 1980s, Mead and Conway formalized integrated circuit design using 
a regular grid, enabling automated layout through algorithms. But the 
[Figure 1 panel labels: 22 nm; Traditional; Sub-10 nm] 

Figure 1 | As a metal oxide-semiconductor field effect transistor 
(MOSFET) shrinks, the gate dielectric (yellow) thickness approaches several 
atoms (0.5 nm at the 22-nm technology node). Atomic spacing limits the 
device density to one device per nanometre, even for radical devices. For 
advanced transistors, grey spheres indicate silicon atoms, while red and blue 
spheres indicate dopant atoms (intentional impurities that alter electrical 
properties). Image redrawn from figure 1 of http://cnx.org/content/m32874/ 
latest/, with permission from Gold Standard Simulations. 


resulting optimization problems remain difficult to solve, and heuristics 
are only good enough for practical use. Besides frequent algorithmic improve- 
ments, each technology generation alters circuit physics and requires new 
computer-aided design software. The cost of design has doubled in a few 
years, becoming prohibitive for integrated circuits with limited market 
penetration. Emerging technologies, such as FinFETs and high-κ dielectrics 
(κ is the dielectric constant), circumvent known obstacles using forms 
of design optimization. Therefore, reasonably tight limits should account 
for potential future optimizations. Low-level technology enhancements, 
no matter how powerful, are often viewed as one-off improvements, in 
contrast to architectural redesigns that affect many processor generations. 
Between technology enhancements and architectural redesigns are global 
and local optimizations that alter the ‘texture’ of integrated circuit design, 
such as logic restructuring, gate sizing and device parameter selection. 
Moore’s law promises higher transistor densities, but some transistors are 
designed to be 32 times larger than others. Large gates consume greater 
power to drive long interconnects at acceptable speed and satisfy perfor- 
mance constraints. Minimizing circuit area and power, subject to timing 
constraints (by configuring each logic gate to a certain size, threshold volt- 
age, and so on), is a difficult but increasingly important optimization with 
a large parameter space. A recent convex optimization method saved 30% 
power in Intel chips, and the impact of such improvements grows with 
circuit size. Many aspects of integrated circuit design are being improved, 
continually raising the bar for technologies that compete with comple- 
mentary metal-oxide-semiconductors (CMOSs). 

Completing new integrated circuit designs, optimizing them and veri- 
fying them requires great effort and continuing innovation; for example, 
the lack of scalable design automation is a limiting factor for analogue 
Figure 2 | As a MOSFET transistor shrinks, the shape of its electric field 
departs from basic rectilinear models, and the level curves become 
disconnected. Atomic-level manufacturing variations, especially for dopant 
atoms, start affecting device parameters, making each transistor slightly 
different. Image redrawn from figure ‘DOTS and LINES’ of ref. 97, with 
permission from Gold Standard Simulations. 
Figure 3 | The evolution of metallic wire stacks from 1997 to 2010. Stacks 
are ordered by the designation of the semiconductor technology node. 
Image redrawn from a presentation image by C. Alpert of IBM Research, 
with permission. 


integrated circuits”°. In 1999, bottom-up analysis of digital integrated 
circuit technologies'>*' outlined design scaling up to self-contained modules 
with 50,000 standard cells (each cell contains one to three logic gates), but 
further scaling was limited by long-range interconnect. In 2010, physical 
separation of modules became less critical, as large-scale placement opti- 
mizations, implemented as software tools, assumed greater responsibility 
for integrated circuit layout and can now intersperse components of nearby 
modules****. In a general trend, powerful design automation” frees circuit 
engineers to focus on microarchitecture*’, but increasingly relies on algo- 
rithmic optimization. Until recently, this strategy suffered significant losses 
in performance® and power® compared to ideal designs, but has now become 
both successful and indispensable owing to the rapidly increasing com- 
plexity of digital and mixed-signal electronic systems. Hardware and soft- 
ware must now be co-designed and co-verified, with software improving 
at a faster rate. Platform-based design combines high-level design abstractions 
with the effective re-use of components and functionalities in engineered 
systems*’. Customizable domain-specific computing® and domain-specific 
programming languages””® offload specialization to software running on 
re-usable hardware platforms. 


Energy-time limits 

In predicting the main obstacles to improving modern electronics, the 
2013 edition of the International Technology Roadmap for Semiconduc- 
tors (ITRS) highlights the management of system power and energy as 
the main challenge’. The faster the computation, the more energy it con- 
sumes, but actual power-performance tradeoffs depend on the physical 
scale. While the ITRS, by its charter, focuses on near-term projections and 
integrated circuit design techniques, fundamental limits reflect available 
energy resources, properties of the physical space, power-dissipation con- 
straints, and energy waste. 


Reversibility 

A 1961 result by Landauer shows that erasing one bit of information entails 
an energy loss ≥ kT ln 2 (the thermodynamic threshold), where k is 
the Boltzmann constant and T is the temperature in kelvin. This principle 
was validated empirically in 2012 (ref. 39) and seems to motivate reversible 
computing, where all input information is preserved, incurring additional 
costs. Formally speaking, zero-energy computation is prohibited by 


14 AUGUST 2014 | VOL 512 | NATURE | 149 


©2014 Macmillan Publishers Limited. All rights reserved 


REVIEW 


[Figure 4 labels: traditional planar; three-dimensional FinFET; gate; source; drain; silicon substrate; high-κ dielectric] 

Figure 4 | FinFET transistors possess a much wider gate dielectric layer (surrounding the fin shape) than do MOSFET transistors and can use multiple fins. 


the energy-time form of the Heisenberg uncertainty principle (ΔtΔE ≥ ħ/2): 
faster computation requires greater energy. However, recent work 
in applied superconductivity demonstrates “highly exotic” physically 
reversible circuits operating at 4 K with energy dissipation below the thermodynamic 
threshold. They apparently fail to scale to large sizes, run into 
other limits, and remain no more practical than ‘mainstream’ super- 
conducting circuits and refrigerated low-power CMOS circuits. Tech- 
nologies that implement quantum circuits“ can approximate reversible 
Boolean computing, but currently do not scale to large sizes, are energy- 
inefficient at the system level, rely on fragile components, and require 
heavy fault-tolerance overheads'’. Conventional integrated circuits also 
do not help to obtain energy savings from reversible computing because 
they dissipate 30%-60% of all energy in (reversible) wires and repeaters. 
At room temperature, Landauer’s limit amounts to 2.85 × 10⁻²¹ J, a 
very small fraction of the total, given that modern integrated circuits 
dissipate 0.1-100 W and contain <10” logic gates. With the increasing 
dominance of interconnect (see section “Asymptotic space-time limits’), 
more energy is spent on communication than on computation. Logi- 
cally reversible computing is important for reasons other than energy 
reduction—in cryptography, quantum information processing, and 
so on*. 
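
The room-temperature figure for Landauer's limit follows directly from kT ln 2. The short check below is a sketch: the function name is illustrative, and 298.15 K is an assumed value for "room temperature" (the text does not state which temperature was used).

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy dissipated when erasing one bit: k * T * ln 2."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


ROOM_TEMPERATURE_K = 298.15  # assumption: 25 degrees Celsius
energy_per_bit = landauer_limit_joules(ROOM_TEMPERATURE_K)
print(f"Landauer limit at {ROOM_TEMPERATURE_K} K: {energy_per_bit:.2e} J per bit")
```

With this assumed temperature the result agrees with the 2.85 × 10⁻²¹ J quoted above to three significant figures.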


Power constraints and CPUs 

The end of CPU frequency scaling. In 2004, Intel abruptly cancelled a 
4-GHz CPU project because its high power density required awkward 
cooling technologies. Other CPU manufacturers kept clock frequencies 
in the 1-6-GHz range, but also resorted to multicore CPUs”. Since dynamic 
circuit power grows with clock frequency and supply voltage squared”, 
energy can be saved by distributing work among slower, lower-voltage 
parallel CPU cores if the parallelization overhead is small. 
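
The saving from slower, lower-voltage parallel cores can be sketched numerically using the common model P = a·C·V²·f (activity factor a, switched capacitance C, supply voltage V, clock frequency f). All numbers below are hypothetical, and the sketch assumes zero parallelization overhead and that halving the frequency permits the lower supply voltage.

```python
def dynamic_power_w(capacitance_f, voltage_v, frequency_hz, activity=1.0):
    """Dynamic switching power: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


C_SWITCHED_F = 1e-9  # hypothetical switched capacitance per core

# One core at 3 GHz and 1.0 V versus two cores at 1.5 GHz and 0.8 V,
# delivering the same aggregate clock cycles per second.
single_core_w = dynamic_power_w(C_SWITCHED_F, 1.0, 3.0e9)
dual_core_w = 2 * dynamic_power_w(C_SWITCHED_F, 0.8, 1.5e9)
power_ratio = dual_core_w / single_core_w
print(f"one fast core: {single_core_w:.2f} W, two slow cores: {dual_core_w:.2f} W")
print(f"power ratio: {power_ratio:.2f}")
```

In this idealized case the two slower cores draw about two-thirds of the power of the single fast core; any parallelization overhead would erode that margin.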

Dark, darker, dim, grey silicon. A companion trend to Moore’s law— 
the Dennard scaling theory**—shows how to keep power consumption 
of semiconductor integrated circuits constant while increasing their den- 
sity. But Dennard scaling broke down ten years ago”. Extrapolation of 
semiconductor scaling trends for CMOSs—the dominant semiconductor 
technology for the past 20 years—shows that the power consumption of 
transistors available in modern integrated circuits reduces more slowly 
than their size (which is subject to Moore’s law)*””°. To ensure acceptable 
performance characteristics of transistors, chip power density must be limited, 
and a fraction of transistors must be kept dark at any given time. Modern 
CPUs have not been able to use all their circuits at once, but this asymptotic 
effect (termed the ‘utilization wall’) will soon black out 99% 
of the chip, hence the term ‘dark silicon’ and a reasoned reference to the 
apocalypse. Saving power by slowing CPU cores down is termed ‘dim 
silicon’. Detailed studies of dark silicon® show similar results. To this end, 
executives from Microsoft and IBM have recently proclaimed an end to 
the era of multicore microprocessors. Two related trends appeared earlier: 
(1) increasingly large integrated circuit regions remain transistor-free to aid 
routeing and physical synthesis, to accommodate power-supply networks, 
and so on****—we call them ‘darker silicon’, (2) increasingly many gates 
do not perform useful computation but reinforce long, weak interconnects” 
or slow down wires that are too short—which I call ‘grey silicon’. Today, 
50%-80% of all gates in high-performance integrated circuits are repeaters. 
Limits for power supply and cooling. Data centres in the USA consumed 
2.2% of its total electricity in 2011. Because power plants take time to build, 
we cannot sustain past trends of doubled power consumption per year. 
It is possible to improve the efficiency of transmission lines (using high- 
temperature superconductors*) and power conversion in data centres, 
but the efficiency of on-chip power networks may soon reach 80%-90%, 
leaving little room for improvement. Modern integrated circuit power 
management includes clock-network and power gating”, per-core voltage 
scaling’®, charge recovery”’ and, in recent processors, a CPU core dedi- 
cated to power scheduling. Integrated circuit power consumption depends 
quadratically on supply voltage, which has decreased steadily for many 
years, but has recently stabilized at 0.5-2 V (ref. 47). Supply voltage typi- 
cally exceeds the threshold voltage of FETs by a safety margin that ensures 
circuit reliability, fast operation and low leakage. Threshold voltage depends 
on the thickness of the gate dielectric, which reached a practical limit of 
several atoms (see section ‘Engineering obstacles’). Transistors cannot 
operate with supply voltage below approximately 200 mV (ref. 17)—five 
times below current practice—and simple circuits reach this limit. With 
slower operation, near- and sub-threshold circuits may consume a hundred 
times less energy**. Cooling technologies can improve too, but fundamental 
quantum limits bound the efficiency of heat removal”. 


Broader limits 

The study in ref. 62 explores a general binary-logic switch model with 
binary states represented by two quantum wells separated by a potential 
barrier. Representing information by electric charge requires energy for 
binary switching and thus limits the logic-switching density, if a signifi- 
cant fraction of the chip can switch simultaneously. To circumvent this 
limit, one can encode information in spin-states, photon polarizations, 
super-conducting currents, or magnetic flux, noting that these carriers 
have already been in commercial use (spin-states are particularly attractive 
because they promise high-density nonvolatile storage®’). More powerful 
limits are based on the amount of material in the Earth’s crust (where sili- 
con is the second most common element after oxygen), on atomic spacing 
(see section ‘Engineering obstacles’), radii, energies and bandgaps, as well 
as the wavelength of the electron. We are currently using only a tiny frac- 
tion of the Earth’s mass for computing, and yet various limits could be 
circumvented if new particles are discovered. Beyond atomic physics, some 
limits rely on basic constants: the speed of light, the gravitational constant, 
the quantum (Planck) scale, the Boltzmann constant, and so on. Lloyd and 
Krauss extend well-known bounds by Bremermann and Bekenstein, and 
give Moore’s law another 150 years and 600 years, respectively. These results 
are too loose to obstruct the performance of practical computers. In con- 
trast, current consensus estimates from the ITRS" give Moore’s law only 
another 10-20 years, due to technological and economic considerations’. 


Asymptotic space-time limits 

Engineering limits for deployed technologies can often be circumvented, 
while first-principles limits on energy and power are loose. Reasonably tight 
limits are rare. 


Limits to parallelism 

Suppose we wish to compare a parallel and sequential computer built 
from the same units, to argue that a new parallel algorithm is many times 
faster than the best sequential algorithm (the same reasoning applies to 
logic gates on an integrated circuit). Given N parallel units and an algo- 
rithm that runs M times faster on sufficiently large inputs, one can simu- 
late the parallel system on the sequential system by dividing its time between 
N computational slices. Since this simulation is roughly N times slower, it 
runs M/N times faster than the original sequential algorithm. If this ori- 
ginal sequential algorithm was the fastest possible, we have M = N. In other 
words, a fair comparison should not demonstrate a parallel speedup that 
exceeds the number of processors—a superlinear speedup can indicate an 
inferior sequential algorithm or the availability of a larger amount of memory 
to N processors. The bound is reasonably tight in practice for small N and 
can be violated slightly because N CPUs include more CPU cache, but 
such violations alone do not justify parallel algorithms—one could instead 
buy or build one CPU with a larger cache. A linear speedup is optimist- 
ically assumed for the parallelizable component in the 1988 Gustafson’s 
law that suggests scaling the number of processors with input size (as illus- 
trated by instantaneous search queries over massive data sets)”. Also in 1988, 
Fisher®* employed asymptotic runtime estimates instead of numerical lim- 
its without considering the parallel and sequential runtime components 
that were assumed in Amdahl’s law® and Gustafson’s law’. Asymptotic 
estimates neglect leading constants and offer a powerful way to capture 
nonlinear phenomena occurring at large scale. 
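
For reference, the two laws mentioned above are usually written as follows. This Review does not spell the formulas out, so the standard textbook forms are assumed here, with p the parallelizable fraction of the work and N the number of processors; the example values of p and N are arbitrary.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Amdahl's law: fixed problem size, speedup = 1 / ((1 - p) + p / N)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction: float, processors: int) -> float:
    """Gustafson's law: problem size scaled with N, speedup = (1 - p) + p * N."""
    return (1.0 - parallel_fraction) + parallel_fraction * processors


P_PARALLEL = 0.95  # arbitrary example value
N_PROCESSORS = 16  # arbitrary example value

amdahl = amdahl_speedup(P_PARALLEL, N_PROCESSORS)
gustafson = gustafson_speedup(P_PARALLEL, N_PROCESSORS)
print(f"Amdahl: {amdahl:.2f}x, Gustafson: {gustafson:.2f}x, bound: {N_PROCESSORS}x")
```

Both estimates stay at or below N, in line with the simulation argument above that a fair comparison should not show a speedup exceeding the number of processors.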

Fisher® assumes a sequential computation with T(n) elementary steps 
for input of size n, and limits the performance of its parallel variants that 
can use an unbounded d-dimensional grid of finite-size computing units 
(electrical switches on a semiconductor chip, logic gates, CPU cores, and 
so on) communicating at a finite speed, say, bounded by the speed of light. 
I highlight only one aspect of this four-page work: the number of steps 
required by parallel computation grows as the (d + 1)th root of T(n). This 
result undermines the N-fold speedup assumed in Gustafson’s law for N 
processors on appropriately sized input data’. A speedup from runtime 
polynomial in n to approximately logn can be achieved in an abstract model 
of computation for matrix multiplication and fast Fourier transforms, but 
not in physical space. Surprising as it may seem, after reviewing many 
loose limits to computation, we have identified a reasonably tight limit 
(the impact of input-output, which is a major bottleneck today, is also 
covered in ref. 65). Indeed, many parallel computations today (excluding 
multimedia processing and World Wide Web searching) are limited by 
several forms of communication and synchronization, including network 
and storage access. The billions of logic gates and memory elements in 
modern integrated circuits are linked by up to 16 levels of wires (Fig. 3); 
longer wires are segmented by repeaters. Most of the physical volume and 
circuit delay are attributed to interconnect. This is relatively new, because
gate delays were dominant until 2000 (ref. 14), but wires get slower relative 
to gates at each new technology node. This uneven scaling has compounded 
in ways that would have surprised Turing and von Neumann—a single 
clock cycle is now far too short for a signal to cross the entire chip, and 
even the distance covered in 200 ps (5 GHz) at light speed is close to the 
chip size. Yet most electrical engineers and computer scientists are still 
primarily concerned with gates. 
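
The two quantitative points in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude illustration: it drops all constant factors from the (d + 1)th-root bound, and it uses the vacuum speed of light, whereas on-chip signals travel more slowly. The function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; wires are slower


def fisher_parallel_steps(sequential_steps: float, dimensions: int) -> float:
    """Order of the parallel step count: T ** (1 / (d + 1)), constants ignored."""
    return sequential_steps ** (1.0 / (dimensions + 1))


def light_distance_m(clock_hz: float) -> float:
    """Distance light covers in vacuum during one clock period, in metres."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz


if __name__ == "__main__":
    # A computation with 10**12 sequential steps on 2-D and 3-D grids.
    print(fisher_parallel_steps(1e12, 2), fisher_parallel_steps(1e12, 3))
    # One 200 ps period (5 GHz) corresponds to roughly 6 cm at light speed.
    print(light_distance_m(5e9))
```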



Implications for three-dimensional and other emerging circuits 
The promise of three-dimensional integration for improving circuit 
performance can be undermined by the technical obstructions to its indus- 
try adoption. To derive limits on possible improvement, we use the result 
from ref. 65, which is sensitive to the dimension of the physical space: a 
sequential computation with T(n) steps requires of the order of $T^{1/3}(n)$
steps in two dimensions and $T^{1/4}(n)$ in three. Letting $t = T^{1/3}(n)$ shows that
three-dimensional integration asymptotically reduces t to $t^{3/4}$, a significant
but not dramatic speedup. This speedup requires an unbounded
number of two-dimensional device layers; otherwise there is no asymptotic
speedup. For three-dimensional integrated circuits with two to three
layers, the main benefits of three-dimensional integration
today are in improving manufacturing yield, improving input-output 
bandwidth, and combining two-dimensional integrated circuits that are 
optimized for random logic, dense memory, field-programmable gate- 
arrays, analogue, microelectromechanical systems and so on. Ultrahigh- 
density CMOS logic integrated circuits with monolithic three-dimensional 
integration suffer higher routeing congestion than traditional two-
dimensional integrated circuits.
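
The asymptotic gain from the third dimension follows from one line of algebra on the (d + 1)th-root bound of ref. 65. Written out, with constant factors ignored as in the text:

```latex
\begin{align*}
t &= T^{1/3}(n) && \text{(two dimensions, } d = 2\text{)} \\
T^{1/4}(n) &= \bigl(T^{1/3}(n)\bigr)^{3/4} = t^{3/4} && \text{(three dimensions, } d = 3\text{)}
\end{align*}
```

As a purely illustrative number, a computation with $T = 10^{12}$ sequential steps would need of the order of $10^{4}$ steps in two dimensions and $10^{3}$ in three.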

Emerging technologies promise to improve device parameters, but often 
remain limited by scale, faults, and interconnect. For example, quantum 
dots enable terahertz switching but hamper nonlocal communication.
Carbon nanotube FETs leverage the extraordinary carrier mobility in semiconducting
carbon nanotubes to use interconnect more efficiently by improving
drive strength, while reducing supply voltage. Emerging interconnects
include silicon photonics, demonstrated by Intel in 2013 (ref. 71) and intended
as a 100 Gb s⁻¹ replacement of copper cables connecting adjacent chips.
Silicon photonics promises to reduce power consumption and form factor. 

In a different twist, quantum physics alters the nature of communication
with Einstein’s “spooky action at a distance” facilitated by entanglement.
However, the flows of information and entropy are subject to quantum
limits. Several quantum algorithms run asymptotically faster than the
best conventional algorithms, but fault-tolerance overhead offsets their
potential benefits in practice except for large input sizes, and the empirical
evidence of quantum speedups has not been compelling so far. Several
stages in the development of quantum information processing remain
challenging, and the surprising difficulty of scaling up reliable quantum
computation could stem from limits on communication and entropy.
In contrast, Lloyd notes that individual quantum devices now approach
the energy limits for switching, whereas non-quantum devices remain orders
of magnitude away. This suggests a possible obstacle to simulating quantum
physics on conventional parallel computers (abstract models aside).
In terms of computational complexity though, quantum computers cannot
attain a significant advantage for many problem types and are unlikely
to overcome the Fisher limit on parallelism from ref. 65. A similar lack
of a consistent general-purpose speedup limits the benefits of several emerging
technologies in mature applications that contain diverse algorithmic
steps, such as World Wide Web searching and computer-aided design.
Accelerating one step usually does not dramatically speed up the entire
application, as noted by Amdahl in 1967. Figuratively speaking, the most
successful computers are designed for the decathlon rather than for the 
sprint only. 


Complexity-theoretic limits 

The previous section, ‘Asymptotic space-time limits’, enabled tighter limits 
by neglecting energy and using asymptotic rather than numeric bounds. I 
now review a more abstract model in order to focus on the impact of scale, 
and to show how recurring trends quickly overtake one-off device-specific 
effects. I neglect spatial effects and focus on the nature of computation in 
an abstract model (used by software engineers) that represents computa- 
tion by elementary steps with input-independent runtimes. Such limits 
survive many improvements in computer technologies, and are often stron- 
ger for specific problems. For example, the best-known algorithms for mul- 
tiplying large numbers are only slightly slower than reading the input (an 
obvious speed limit), but only in the asymptotic sense: for numbers with 
fewer than a thousand bits, those algorithms lag behind simpler algorithms
in actual performance. To focus on what matters most, I no longer track
the asymptotic worst-case complexity of the best algorithms for a given 
problem, but merely distinguish polynomial asymptotic growth from 
exponential. 

Limits formulated in such crude terms (unsolvability in polynomial 
time on any computer) are powerful: the hardness of number-factoring
underpins Internet commerce, while the P ≠ NP conjecture explains the
lack of satisfactory, scalable solutions to important algorithmic problems,
in optimization and verification of integrated circuit designs, for example.
(Here P is the class of decision problems that can be solved using simple
computational steps whose number grows no faster than a polynomial of
the size of input data, and NP is the non-deterministic polynomial class
representing those decision problems for which a non-deterministically
guessed solution can be reliably checked using a polynomial number of
steps.) A similar conjecture, P ≠ NC, seeks to explain why many algorithmic
problems that can be solved efficiently have not parallelized efficiently.
Most of these limits have not been proved. Some can be circumvented by
using radically different physics; for example, quantum computers can solve
number factoring in polynomial time (in theory). But quantum computation
does not affect P ≠ NP (ref. 77). The lack of proofs, despite heavy
empirical evidence, requires faith and is an important limitation of many
nonphysical limits to computing. This faith is not universally shared:
Knuth (see question 17 in http://www.informit.com/articles/article.aspx?p=2213858)
argues that P = NP would not contradict anything we know
today. A rare proved result by Turing states that checking whether a given 
program ever halts is undecidable: no algorithm solves this problem in all 
cases regardless of runtime. Yet software developers solve this problem 
during peer code reviews, and so do computer science teachers when grad- 
ing exams in programming courses. 
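
The distinction between checking a guessed solution and finding one, which underlies the definition of NP given above, can be sketched with Boolean satisfiability. In the sketch below the clause encoding (signed integers for literals) and both function names are illustrative assumptions; `verify` runs in time linear in the formula size, whereas `brute_force_sat` may try all 2^n assignments.

```python
from itertools import product
from typing import Dict, List, Optional

# A clause is a list of literals: k means variable k is true, -k means false.
Clause = List[int]


def verify(clauses: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Check a candidate assignment; cost is linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force_sat(clauses: List[Clause]) -> Optional[Dict[int, bool]]:
    """Search for a satisfying assignment by trying all 2**n candidates."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if verify(clauses, assignment):
            return assignment
    return None  # no assignment satisfies every clause


if __name__ == "__main__":
    formula = [[1, 2], [-1, 2], [-2, 3]]
    print(brute_force_sat(formula))
```

Whether the exhaustive search can always be replaced by a polynomial-time procedure is exactly the open question; the sketch only shows why checking is the easy direction.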

Worst-case analysis is another limitation of nonphysical limits to com- 
puting, but suggests potential gains through approximation and special- 
ization. For some NP-hard optimization problems, such as the Euclidean 
Travelling Salesman Problem, polynomial-time approximations exist, but 
in other cases, such as the Maximum Clique problem, accurate approximation
is as hard as finding optimal solutions. For some important problems
and algorithms, such as the Simplex algorithm for linear programming,
few inputs lead to exponential runtime, and minute perturbations reduce
runtime to polynomial.


Conclusions 


The death march of Moore’s law invites discussions of fundamental
limits and alternatives to silicon semiconductors. Near-term constraints
(obstacles to performance, power, materials, laser sources, manufacturing
technologies and so on) are invariably tied to costs and capital, but are
disregarded for the moment as new markets for electronics open up, populations
increase, and the world economy grows. Such economic pressures
emphasize the value of computational universality and the broad appli- 
cability of integrated circuit architectures to solve multiple tasks under 
conventional environmental conditions. In a likely scenario, only CPUs, 
graphics processing units, field-programmable gate-arrays and dense mem- 
ory integrated circuits will remain viable at the end of Moore’s law, while 
specialized circuits will be predominantly manufactured with less advanced 
technologies for financial reasons. Indeed, memory chips have exemplified 
Moore scaling because of their simpler structure, modest interconnect, 
and more controllable manufacturing, but the miniaturization of memory
cells is now slowing down. The decelerated scaling of CMOS integrated
circuits still outperforms the scaling of the most viable emerging
technologies. Empirical scaling laws describing the evolution of computing 
are well known. In addition to Moore’s law, Dennard scaling, Amdahl’s
law and Gustafson’s law (reviewed above), Metcalfe’s law states that the
value of a computer network, such as the Internet or Facebook, scales as
the number of user-to-user connections that can be formed. Grosch’s law
ties N-fold improvements in computer performance to N’-fold cost increases 
(in equivalent units). Applying it in reverse, we can estimate the acceptable
performance of cheaper computers. However, such laws only capture
ongoing scaling and may not apply in the future.

The roadmapping process represented by the ITRS relies on consensus
estimates and works around engineering obstacles. It tracks improvements 
in materials and tools, collects best practices and outlines promising design 
strategies. As suggested in refs 17 and 18, it can be enriched by an analysis of 
limits. I additionally focus on how closely such limits can be approached. 
Aside from the historical ‘wrong turns’ mentioned in the ‘Engineering 
obstacles’ and ‘Energy-time limits’ sections above, I uncover interesting 
effects when examining the tightness of individual limits. Although energy-time
limits are most critical in computer design, space-time limits appear
tighter and capture bottlenecks formed by interconnect and communication.
They suggest optimizing gate locations and sizes, and placing gates in
three dimensions. One can also adapt algorithms to spatial embeddings
and seek space-time limits. But the gap between current technologies and
energy-time limits hints at greater possible rewards. Charge recovery,
power management, voltage scaling, and near-threshold computing
reduce energy waste. Optimizing algorithms and circuits simultaneously
for energy and spatial embedding gives biological systems an edge (from
the ‘one-dimensional’ nematode Caenorhabditis elegans with 302 neurons
to the three-dimensional human brain with 86 billion neurons). Yet, using
the energy associated with mass (according to Einstein’s E = mc² formula)
to compute can truly be a ‘nuclear option’—both powerful and contro- 
versial. In a well known 1959 talk, which predated Moore’s law, Richard 
Feynman suggested that there was “plenty of room at the bottom,” fore- 
casting the miniaturization of electronics. Today, with relatively little phys- 
ical room left, there is plenty of energy at the bottom. If this energy is tapped 
for computing, how can the resulting heat be removed? Recycling heat 
into mass or electricity seems to be ruled out by limits to energy conver- 
sion and the acceptable thermal range for modern computers. 

Technology-specific limits for modern computers tend to express trade-offs,
especially for systems with conflicting performance parameters and
properties. Little is known about limits on design technologies. Given
that large-scale complex systems are often designed and implemented
hierarchically with multiple levels of abstraction, it would be valuable to
capture losses incurred at abstraction boundaries (for example, the phys- 
ical layout and manufacturing considerations required to optimize and 
build a logic circuit may mean that the logic circuit itself needs to change) 
and between levels of design hierarchies. It is common to estimate resources 
required for a subsystem and then to implement the subsystem to satisfy 
resource budgets. Underestimation is avoided because it leads to failures, 
but overestimation results in overdesign. Inaccuracies in estimation and 
physical modelling also lead to losses during optimization, especially in 
the presence of uncertainty. Clarifying engineering limits gives us the hope 
of circumventing them. 

Technology-agnostic limits appear to be simple and have had significant
effects in practice; for example, Aaronson explains why NP-hardness
is unlikely to be circumvented through physics. Limits to parallel computation
became prominent after CPU speed levelled off ten years ago.
These limits suggest that it will be helpful to use the following: faster
interconnect, local computation that reduces communication, time-division
multiplexing of logic, architectural and algorithmic techniques,
and applications altered to embrace parallelism. Gustafson advocates a
‘natural selection’: the survival of the applications that are fittest for
parallelism. In another twist, the performance and power consumption of
industry-scale distributed systems are often described by probability
distributions, rather than single numbers, making it harder even to formulate
appropriate limits. We also cannot yet formulate fundamental limits
related to the complexity of the software-development effort, the efficiency
of CPU caches, and the computational requirements of incremental
functional verification, but we have noticed that many known limits are
either loose or can be circumvented, leading to secondary limits. For example,
the P ≠ NP limit is worded in terms of worst-case rather than average-case
performance, and has not been proved despite much empirical evidence.
Researchers have ruled out entire categories of proof techniques as insufficient
to complete such a proof. They may be esoteric, but such tertiary
limits can be effective in practice: in August 2010, they helped researchers
quickly invalidate Vinay Deolalikar’s highly technical attempt at proving
P ≠ NP. On the other hand, the correctness of lengthy proofs for some key
results could not be established with an acceptable level of certainty by
reviewers, prompting efforts towards verifying mathematics by computation.


In summary, I have reviewed what is known about limits to computation,
including existential challenges arising in the sciences, optimization
challenges arising in engineering, and the current state of the art. These 
categories are closely linked during rapid technology development. When 
a specific limit is approached and obstructs progress, understanding its 
assumptions is a key to circumventing it. Some limits are hopelessly loose 
and can be ignored, while other limits remain conjectural and are based 
on empirical evidence only; these may be very difficult to establish rigorously.
Such limits on limits to computation deserve further study.


Sequencing studies of breast tumour cohorts have identified many prevalent mutations, but provide limited insight into
the genomic diversity within tumours. Here we developed a whole-genome and exome single cell sequencing approach 
called nuc-seq that uses G2/M nuclei to achieve 91% mean coverage breadth. We applied this method to sequence single
normal and tumour nuclei from an oestrogen-receptor-positive (ER+) breast cancer and a triple-negative ductal carcinoma. In
parallel, we performed single nuclei copy number profiling. Our data show that aneuploid rearrangements occurred early in
tumour evolution and remained highly stable as the tumour masses clonally expanded. In contrast, point mutations evolved
gradually, generating extensive clonal diversity. Using targeted single-molecule sequencing, many of the diverse mutations
were shown to occur at low frequencies (<10%) in the tumour mass. Using mathematical modelling we found that the
triple-negative tumour cells had an increased mutation rate (13.3×), whereas the ER+ tumour cells did not. These findings
have important implications for the diagnosis, therapeutic treatment and evolution of chemoresistance in breast cancer. 


Human breast cancers often display intratumour genomic heterogeneity.
This clonal diversity confounds the clinical diagnosis and basic research
of human cancers. Expression profiling has shown that breast cancers
can be classified into five molecular subtypes that correlate with the presence
of oestrogen, progesterone and Her2 receptors. Among these,
triple-negative breast cancers (ER−/PR−/Her2−) have been shown to harbour
the largest number of mutations, whereas luminal A (ER+/PR+/Her2−)
breast cancers show the lowest frequencies. These data suggest
that triple-negative breast cancers (TNBCs) may have increased
clonal diversity and mutational evolution, but such inferences are difficult
to make in bulk tissues. To gain better insight into the genomic
diversity of breast tumours, we developed a single cell genome sequencing
method and applied it to study mutational evolution in an ER+
breast cancer (ERBC) and a TNBC patient. We combined this approach
with targeted duplex single-molecule sequencing to profile thousands
of cells and understand the role of rare mutations in tumour evolution.


Whole-genome sequencing using G2/M nuclei 
In our previous work we developed a method using degenerate- 
oligonucleotide PCR and sparse sequencing to measure copy number 
profiles of single cells. Although adequate for copy number detection,
this method could not resolve genome-wide mutations at base-pair reso- 
lution. We attempted to increase coverage by deep-sequencing these 
libraries, but found that coverage breadth approached a limit near 10% 
(Fig. 1a). To address this problem, we developed a high-coverage, whole- 
genome and exome single cell sequencing method called nuc-seq (Ex- 
tended Data Fig. 1). In this method we exploit the natural cell cycle, in 
which single cells duplicate their genome during S phase, expanding 
their DNA from 6 to 12 picograms before cytokinesis. This approach 
provides an advantage over using chemical inhibitors to induce polyploidy
in single cells because it does not require live cells.

We input four (or more) copies of each single cell genome for whole-genome
amplification (WGA) to decrease the allelic dropout and false
positive error rates, which are major sources of error during multiple-displacement
amplification (MDA). Additionally, we limit the MDA
time to 80 min to mitigate false positive (FP) errors associated with the
infidelity of the φ29 polymerase (Methods). The improved amplification
efficiency can be shown using 22 chromosome-specific primer pairs
for PCR (Extended Data Fig. 2). In G1/G0 single cells we find that only
25.58% (11/43) of the cells show full amplification of the chromosomes,
whereas G2/M cells have 45.34% (39/86). After MDA, we incubate the
amplified DNA with a Tn5 transposase, which simultaneously fragments
DNA and ligates adapters for sequencing. The libraries are then multiplexed
for exome capture or used directly for next-generation sequencing.


Method validation in a monoclonal cancer cell line 


To validate our method we used a breast cancer cell line (SK-BR-3) that
was previously shown to be genetically monoclonal. We evaluated
the genetic homogeneity of this cell line using spectral karyotyping and
found that large chromosome rearrangements were highly stable in
85.80% of the single cells (Supplementary Table 1). We also performed
single nucleus sequencing (SNS) on 50 single SK-BR-3 cells and calculated
copy number profiles at 220 kilobase (kb) resolution, which
showed that the major amplifications of MET, MYC, ERBB2, BCAS1
and a deletion in DCC were stable (mean R² = 0.91) in all of the 50 cells
(Fig. 1b). Next, we deep-sequenced the SK-BR-3 cell population (SKP)
at high coverage depth (51×) and breadth (90.40%) and detected single-nucleotide
variants (SNVs), copy number aberrations (CNAs) and structural
variants (SVs) using our processing pipeline (Methods). We filtered
the variants using dbSNP135 and identified 409 non-synonymous var- 
iants and 1,452 structural variants (Fig. 1d), several of which occurred in 
cancer genes (Supplementary Table 2). 

Figure 1 | Method performance in a monoclonal cell line. a, Coverage
breadth for single cells (SK1, SK2) sequenced by nuc-seq, a single cell SNS
library and a SK-BR-3 population (SKP) sample. b, Heatmap of 50 single cell
SK-BR-3 copy number profiles. c, Lorenz curve of coverage uniformity for the

We applied nuc-seq to sequence the whole genomes of two single
SK-BR-3 cells (SK1 and SK2) and calculated coverage depth, breadth
(sites with at least one read) and uniformity (evenness). We found that
both SK-BR-3 cells achieved high coverage depth (61 ± 5 s.e.m., n = 2)
and breadth (83.70 ± 3.40% s.e.m., n = 2) (Fig. 1e). In comparison, we
re-analysed coverage breadth in single cells sequenced by MALBAC
using unique reads and calculated 69.54% coverage breadth. We evaluated
coverage uniformity using Lorenz curves, which showed highly
uniform coverage, representing a major improvement over our previous
SNS method and equivalent to the MALBAC data (Fig. 1c). Next,
we calculated error rates, including the allelic dropout rate (ADR) and
false positive rate (FPR), by comparing single cell variants to the population
data (Methods). Our analysis suggests that nuc-seq generates low
allelic dropout rates (9.73 ± 2.19%) compared to previous studies
(7–46%). We also achieved low false positive error rates for point mutations
(FPR = 1.24 × 10⁻⁶), equivalent to 1–2 errors per million bases,
which represents a major technical improvement over previous methods
(FPR = 2.52 X10 °and4xX 10 °). 
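
The coverage metrics used above can be made concrete with a small sketch. It assumes that per-site read depths are available as a plain list of non-negative integers; the function names are hypothetical, and this is not the processing pipeline described in the Methods. Breadth is the fraction of sites with at least one read, and the Lorenz curve plots the cumulative fraction of reads against the cumulative fraction of sites sorted by increasing depth, so perfectly uniform coverage falls on the diagonal.

```python
from typing import List, Sequence, Tuple


def coverage_breadth(depths: Sequence[int]) -> float:
    """Fraction of sites covered by at least one read."""
    return sum(1 for d in depths if d > 0) / len(depths)


def lorenz_curve(depths: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Cumulative fraction of sites (x) versus cumulative fraction of reads (y)."""
    ordered = sorted(depths)  # least-covered sites first
    total = sum(ordered)
    if total == 0:
        raise ValueError("no reads: the Lorenz curve is undefined")
    xs, ys = [0.0], [0.0]
    running = 0
    for i, depth in enumerate(ordered, start=1):
        running += depth
        xs.append(i / len(ordered))
        ys.append(running / total)
    return xs, ys


if __name__ == "__main__":
    toy_depths = [0, 3, 5, 0, 8, 4]  # made-up depths for six sites
    print(coverage_breadth(toy_depths))
    print(lorenz_curve(toy_depths))
```

The further the curve sags below the diagonal, the less even the coverage.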


Population and single nuclei sequencing of an ERBC 


We selected an invasive ductal carcinoma from an oestrogen-receptor
positive (ER+/PR+/Her2−) breast cancer patient for population and
single cell sequencing (Fig. 2a, Methods). We flow-sorted millions of
nuclei from the aneuploid G2/M peak (6N) and from matched normal
tissue for population sequencing (46× and 54×) (Fig. 2b). We also flow-sorted
50 single nuclei for copy number profiling, 4 nuclei for whole-
genome sequencing and 59 nuclei for exome sequencing. After filtering 
germline variants, we identified a total of 4,162 somatic SNVs in the aneu- 
ploid tumour cell population. Among these SNVs we identified 12 non- 
synonymous mutations, which we validated by exome sequencing (66%). 
Several non-synonymous mutations occurred in cancer genes, including 
PIK3CA, CASP3, FBN2 and PPP2R5E (Fig. 2c, Supplementary Information). 
PIK3CA is the most common driver mutation in luminal A breast cancers.

single SK-BR-3 cells sequenced by nuc-seq, a cell sequenced by SNS, a
population of SK-BR-3 cells, and a cell sequenced by MALBAC. d, Circos plot
of variants detected by sequencing populations of SK-BR-3 cells. e, Coverage
depth for the SK-BR-3 population sample and the SK1 and SK2 single cells.

To investigate copy number diversity, we performed single nucleus
sequencing on 50 single nuclei. We constructed a neighbour-joining
tree, which showed that single tumour cells shared highly similar CNAs
(mean R² = 0.89), representing a monoclonal population (Fig. 2d, Extended
Data Fig. 3a). Next, we performed whole-genome sequencing of
four single tumour nuclei at high coverage breadth (80.79 ± 3.31%
s.e.m., n = 4) and depth (mean 46.75 ± 5.06 s.e.m., n = 4). From these
data we identified three classes of mutations: (1) clonal mutations,
detected in the population sample and in the majority of single tumour
cells; (2) subclonal mutations, detected in two or more single cells, but
not in the bulk tumour; and (3) de novo mutations, found in only one
tumour cell. The de novo mutations are difficult to distinguish from
technical errors and were therefore excluded from our initial analysis.
In total we detected 12 clonal non-synonymous mutations and 32 subclonal
mutations (Fig. 2e). Many subclonal mutations occurred in intergenic
regions; however, two mutations (MARCH11 and CABP2) were
found in coding regions (Supplementary Table 4).
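
The three mutation classes defined above amount to a simple decision rule, sketched below. The function name and arguments are hypothetical, "majority" is read here as strictly more than half of the sequenced tumour cells, and the `unclassified` fallback covers combinations the text does not address; all three are assumptions of this sketch rather than details from the study.

```python
def classify_mutation(cells_with_mutation: int,
                      cells_sequenced: int,
                      in_population_sample: bool) -> str:
    """Assign a mutation to one of the classes described in the text."""
    # (1) Clonal: seen in the bulk sample and in most single tumour cells.
    if in_population_sample and cells_with_mutation > cells_sequenced / 2:
        return "clonal"
    # (2) Subclonal: seen in two or more single cells but not in the bulk.
    if cells_with_mutation >= 2 and not in_population_sample:
        return "subclonal"
    # (3) De novo: seen in exactly one tumour cell.
    if cells_with_mutation == 1:
        return "de novo"
    # Anything else is not covered by the three definitions.
    return "unclassified"


if __name__ == "__main__":
    # Four sequenced nuclei, as in the whole-genome experiment.
    print(classify_mutation(4, 4, True))
    print(classify_mutation(2, 4, False))
    print(classify_mutation(1, 4, False))
```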

To identify additional subclonal mutations, we performed single nuclei 
exome sequencing on a larger set of cells (47 tumour cells and 12 normal 
cells). Each nucleus was sequenced at 46.78× (46.78 ± 4.95 s.e.m., n = 59)
coverage depth and 92.77% (92.77 ± 4.85 s.e.m., n = 59) coverage breadth,
from which somatic mutations were detected (Supplementary Table 5).
The mutations were clustered and sorted by frequency to construct a
heatmap (Fig. 2f). As expected, the 17 clonal mutations identified by
population sequencing were present in many of the single tumour cells;
however, we also identified 22 new subclonal mutations. In contrast,
only a single subclonal mutation was detected in the 12 normal cells 
(Fig. 2f, right panel). 


Population and single nuclei sequencing of a TNBC 

Figure 2 | Single cell and population sequencing of an ER+ tumour. a, Frozen
ER+ tumour specimen. b, Flow-sorting histogram of ploidy distributions.
c, Circos plot of mutations and CNAs detected in the population of aneuploid
tumour cells. Cancer genes are on the outer ring. d, Neighbour-joining tree
of integer copy number profiles from single diploid and aneuploid cells, rooted

We then proceeded to analyse a triple-negative (ER−/PR−/Her2−) breast
cancer (TNBC) (Fig. 3a). We performed population sequencing of the
bulk tumour (72×) and matched normal tissue (74×), and identified
374 non-synonymous mutations. A number of mutations occurred in
cancer genes, including PTEN, TBX3, NOTCH2, JAK1, ARAF, NOTCH3,
MAP3K4, NTRK1, AFF4, CDH6, SETBP1, AKAP9, MAP2K7, ECM2 and
ECM1 (Supplementary Table 6) (Fig. 3b). Many of these mutations were
previously reported in the TCGA breast cancer cohort. Pathway analysis
revealed two major pathways that were disrupted during tumour evolu- 
tion: TGF-B (P = 9.9 X 10” *) and extracellular matrix-receptor signalling 
(P=2.7X10 *). Copy number profiling identified many chromo- 
somal deletions, in addition to a focal amplification on chromosome 
19p13.2 (Fig. 3b). 

To investigate genomic diversity at single cell resolution, we performed
copy number profiling and exome sequencing. We flow-sorted 50 single
nuclei from the hypodiploid (H), diploid (D) and aneuploid (A) ploidy
distributions for copy number profiling using SNS (Fig. 3c). Neighbour-joining
revealed two distinct subpopulations of tumour cells (A and H)
in addition to the normal diploid cells (Fig. 3d). The single cell copy number
profiles were analysed using clustered heatmaps, which showed highly
similar rearrangements within each subpopulation (A mean R² = 0.91,
H mean R² = 0.88); the two subpopulations were distinguished by two large
deletions on chromosomes 9 and 15 (Extended Data Fig. 3b).

by the diploid node. e, Circos plots of whole-genome single cell sequencing data
showing mutations detected in two or more cells. f, Heatmap of coding
mutations detected by single-nuclei exome sequencing. Mutations detected by
whole-genome sequencing (pop) and exome sequencing (ex) are also displayed.
Key: reference allele, single cell mutation, population mutation.

Next, we flow-sorted 16 single tumour nuclei from the G2/M peaks
(H and A) and 16 single normal nuclei for exome sequencing using nuc-seq
(Fig. 3e). Non-synonymous point mutations were used to perform
hierarchical clustering and multi-dimensional scaling (MDS). As expected,
the 374 clonal non-synonymous mutations detected by bulk sequencing
were found in the majority of the single tumour cells; however, we also
identified 145 additional subclonal non-synonymous mutations that were
not detected in the bulk tumour (Supplementary Table 7). MDS identified
4 distinct clusters, corresponding to three tumour subpopulations
(H, A1 and A2) and the normal cells (Extended Data Fig. 5a). Hierarchical
clustering showed that many of the subclonal mutations occurred exclusively
in one subpopulation (H, A1 or A2) (Fig. 3e). The A1 subpopulation
contained 66 unique subclonal non-synonymous mutations, including
AURKA, SYNE2 and PPP2R1A. The A2 subpopulation contained 52
unique subclonal non-synonymous mutations including TGFB2 and
CHRM5. In contrast only two subclonal mutations were shared between
the normal cells (Fig. 3e, right panel). Many of the subclonal mutations
(23.44%) were predicted to damage protein function by both POLYPHEN
and SIFT (Extended Data Fig. 5b).


Single-molecule targeted deep sequencing 


cancer. a, Frozen TNBC specimen. b, Circos plot of mutations and CNAs
detected by population sequencing of the TNBC, with cancer genes on the outer
ring. c, Flow-sorting histogram of ploidy distributions, showing three major
subpopulations: diploid (D), hypodiploid (H) and aneuploid (A).

To validate the mutations detected by single cell sequencing and determine
their frequencies in the bulk tumour, we performed targeted
single-molecule deep-sequencing. Duplex libraries were constructed
from bulk tissue to reduce the error rate of next-generation sequencing.
Custom capture platforms were designed to target mutations detected in
the single cells of the ERBC and TNBC tumours (Methods). Targeted
deep-sequencing (116,952×) was performed in the ER+ tumour, resulting
in a single-molecule coverage depth of 5,695× using single-strand
consensus sequences (SSCS). Deep-sequencing of the TNBC (118,743×)
resulted in a single-molecule coverage depth of 6,634× using SSCS (Extended
Data Fig. 4). We found that 61.5% of the reads were in the target
regions in the ERBC and 80.2% in the TNBC.

The ERBC duplex data validated 94.44% (17/18) of the clonal muta- 
tions, 90.47% (19/21) of the subclonal mutations, and 19.40% (26/134) 
of the de novo mutations detected by single cell sequencing (P < 0.01) 
(Methods). The clonal mutations occurred at high frequencies in the 
tumour mass, whereas the subclonal mutations (0.0895 mean) and de 
novo mutations (0.0195 mean) were very rare (Fig. 4a). Similarly, in the 
TNBC we validated 99.73% (374/375) of the clonal mutations, 64.83% 
(94/145) of the subclonal mutations and 26.99% (152/563) of the de novo 
mutations (P < 0.01) (Methods). We also found that the clonal 
mutations in the TNBC showed high frequencies (0.4457 mean), whereas 
the subclonal mutations were less prevalent (0.050 mean) and the 
de novo mutations were very rare (0.00047 mean) (Fig. 4b). These data 
suggest that many of the subclonal and de novo mutations are likely to be 
real biological variants that occur at low frequencies in the tumour mass. 
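
As a quick arithmetic check, the validation rates can be recomputed from the counts given in the text. The sketch below uses only those counts; the dictionary and function names are illustrative. The recomputed values agree with the reported percentages to within 0.01 percentage points.

```python
# (validated, tested) counts as quoted in the text for each mutation class.
VALIDATION_COUNTS = {
    ("ERBC", "clonal"): (17, 18),
    ("ERBC", "subclonal"): (19, 21),
    ("ERBC", "de novo"): (26, 134),
    ("TNBC", "clonal"): (374, 375),
    ("TNBC", "subclonal"): (94, 145),
    ("TNBC", "de novo"): (152, 563),
}

# Percentages as printed in the text, kept for comparison.
REPORTED_PERCENT = {
    ("ERBC", "clonal"): 94.44,
    ("ERBC", "subclonal"): 90.47,
    ("ERBC", "de novo"): 19.40,
    ("TNBC", "clonal"): 99.73,
    ("TNBC", "subclonal"): 64.83,
    ("TNBC", "de novo"): 26.99,
}


def validation_rate(validated, tested):
    """Percentage of tested mutations confirmed by duplex sequencing."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * validated / tested


RATES = {key: validation_rate(*counts) for key, counts in VALIDATION_COUNTS.items()}

for (tumour, kind), rate in RATES.items():
    print(f"{tumour} {kind}: {rate:.2f}% (reported {REPORTED_PERCENT[(tumour, kind)]:.2f}%)")
```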


Mathematical modelling of the mutation rates 

To estimate the mutation rates in each tumour, we used the single cell 
mutation frequencies and designed a mathematical stochastic birth-and- 
death process model that uses experimentally derived parameters for cell 
birth rates (Ki-67 staining), cell death rates (caspase-3 staining), total 
tumour cell numbers (flow-sorting cell counts) and the tumour mass 
doubling time for invasive carcinomas (mean = 168 days) (Methods). 
We modelled data for a series of mutation rates and compared the data to 
the empirical single cell mutation frequency distributions (Supplementary 
Table 8). Our data suggest that the ERBC had a mutation rate of Mp = 0.6 
mutations per cell division for the exome data (Fig. 4c) and Mp = 0.9 for 
the single cell whole-genome data (Fig. 4d). These data are similar to 
the error rates reported for normal cells, which are approximately 0.6 
mutations per cell division (error rate = 1 × 10⁻¹⁰). In contrast, our 
modelling suggests a mutation rate of Mp = 8 for the TNBC, suggesting 
a 13.3-fold increase relative to normal cells (Fig. 4e). 

by the diploid node. e, Clustered heatmap of the nonsynonymous point 
mutations detected by single nuclei exome sequencing and population 
sequencing (P). Mutations detected in one cell are excluded. 
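
The arithmetic behind these figures, and the general shape of a birth-and-death simulation, can be sketched as follows. This is a toy illustration under stated assumptions, not the model described in the Methods: the diploid genome size of 6 × 10⁹ bp, the discrete generations, the per-generation birth and death probabilities and all function names are assumptions introduced here.

```python
import math
import random

# Assumed diploid human genome size in base pairs (not stated in the text).
DIPLOID_GENOME_BP = 6e9


def mutations_per_division(per_bp_error_rate, genome_bp=DIPLOID_GENOME_BP):
    """Expected number of new mutations per cell division."""
    return per_bp_error_rate * genome_bp


def fold_increase(tumour_rate, normal_rate):
    """Ratio of a tumour mutation rate to the normal-cell rate."""
    return tumour_rate / normal_rate


def poisson(rng, lam):
    """Draw a Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_clone(mutation_rate, p_birth, p_death, generations,
                   seed=0, max_cells=5000):
    """Toy discrete-generation birth-and-death process from a single cell.

    Each cell is a frozenset of mutation identifiers. In every generation a
    cell dies with probability p_death, divides with probability p_birth
    (each daughter gains Poisson(mutation_rate) new mutations), or otherwise
    persists unchanged. Growth stops once max_cells is reached.
    """
    rng = random.Random(seed)
    cells = [frozenset()]
    next_id = 0
    for _ in range(generations):
        survivors = []
        for cell in cells:
            r = rng.random()
            if r < p_death:
                continue  # cell death
            if r < p_death + p_birth:
                for _ in range(2):  # two daughters, mutated independently
                    n_new = poisson(rng, mutation_rate)
                    survivors.append(cell | frozenset(range(next_id, next_id + n_new)))
                    next_id += n_new
            else:
                survivors.append(cell)
        cells = survivors
        if not cells or len(cells) >= max_cells:
            break
    return cells


print(mutations_per_division(1e-10))      # about 0.6 mutations per division
print(round(fold_increase(8, 0.6), 1))    # about 13.3
```

In a fuller treatment, the simulated distribution of mutation frequencies across cells would be compared with the empirical single cell distribution for each candidate rate; that comparison is omitted from this sketch.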


Discussion 


In this study we report the development of a novel single cell genome 
sequencing method that utilizes G2/M nuclei to achieve high-coverage 
data with low error rates. Although G2/M nuclei were used in this study, 
the experimental protocol can also be used to sequence nuclei at any 
stage of the cell cycle. We applied nuc-seq to delineate clonal diversity 
and investigate mutational evolution in two breast cancer patients. Our 
data clearly show that no two single tumour cells are genetically identical, 
calling into question the strict definition of a clone. In both patients 
we observed a large number of subclonal and de novo mutations. These 
data suggest that point mutations evolved gradually over long periods 
of time, generating extensive clonal diversity (Fig. 4f, g). In contrast, the 
single cell copy number profiles were highly similar, suggesting that 
chromosome rearrangements occurred early, in punctuated bursts of 
evolution, followed by stable clonal expansions to form the tumour 
mass (Fig. 4h, i). 

We previously reported punctuated copy number evolution by sequencing 
single cells from a TNBC patient. This model has also been supported 
by bulk sequencing data in prostate cancer and in rearrangement 
patterns called firestorms or chromothripsis. A punctuated model is 
consistent with the mechanisms that underlie CNAs, including chromosome 
missegregation, cytokinesis defects and breakage-fusion-bridge, 
which can generate complex rearrangements in just a few cell divisions. 
In contrast, point mutations occur through defects in DNA repair or 
replication machinery, which accumulate more gradually over many 
cell divisions. Our data are consistent with these mechanisms, and fur- 
ther show that two distinct molecular clocks were operating at different 
stages of tumour growth (Extended Data Fig. 6). 

A pervasive problem in the field of single cell genomics is the inab- 
ility to validate mutations that are detected in single cells. To address this 
problem, we combined single cell sequencing with targeted single-molecule 
deep-sequencing. This approach not only validates mutations, but also 
measures the precise mutation frequencies in the bulk population. 
Using this approach, we identified hundreds of subclonal and de novo 
mutations that were present at low frequencies (<10%) in the tumour 
mass. These rare mutations may have an important role in diversifying 
the phenotypes of cancer cells, allowing them to survive selective pres- 
sures in the tumour microenvironment, including the immune system, 
hypoxia and chemotherapy. 

A salient question in the field of chemotherapy is whether resist- 
ance mutations are pre-existing in rare cells in the tumour, or alter- 
natively, emerge spontaneously in response to being challenged by the 
therapeutic agent. Although this question has been studied for decades 
in bacteria, it remains poorly understood in human cancers. 
Our data suggest that a large number of diverse mutations are likely to 
be pre-existing in the tumour mass before chemotherapy. Our data also 
have important implications for the mutator phenotype, which posits 
that tumour evolution is driven by increased mutation rates. Although 
TCGA studies report increased mutation frequencies, it remains 
unclear whether these mutations accumulate over many cell divisions 
(at a normal error rate) or through an increased mutation rate. Our 
TNBC data suggest an increased mutation rate (13.3×) relative to the 
normal cells, supporting this model. 

d, ERBC whole-genome single nuclei and modelling data at 0.9 mutation rate. 
e, TNBC single nuclei exome and modelling data at a mutation rate of 8. 
f, Mutation frequencies shared by 2 or more cells in the ERBC. g, Mutation 
frequencies shared by 2 or more cells in the TNBC. h, CNAs shared by two or 
more cells in the ERBC. i, CNAs shared by two or more cells in the TNBC. 

We expect that single cell genome sequencing will open up new ave- 
nues of investigation in many diverse fields of biology. In cancer research 
there will be immediate applications for studying cancer stem cells and 
circulating tumour cells. In the clinic, these tools will have important 
applications in early detection and non-invasive monitoring. Beyond 
cancer, these tools will have utility in microbiology, development, im- 
munology and neuroscience and will lead to substantial improvements 
in our fundamental understanding of human diseases. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Crystal structure of the human COP9 signalosome 

Eric S. Fischer & Nicolas H. Thomä 


Ubiquitination is a crucial cellular signalling process, and is controlled on multiple levels. Cullin-RING E3 ubiquitin ligases 
(CRLs) are regulated by the eight-subunit COP9 signalosome (CSN). CSN inactivates CRLs by removing their covalently 
attached activator, NEDD8. NEDD8 cleavage by CSN is catalysed by CSN5, a Zn²⁺-dependent isopeptidase that is inactive 
in isolation. Here we present the crystal structure of the entire ~350-kDa human CSN holoenzyme at 3.8 Å resolution, 
detailing the molecular architecture of the complex. CSN has two organizational centres: a horseshoe-shaped ring created 
by its six proteasome lid–CSN–initiation factor 3 (PCI) domain proteins, and a large bundle formed by the carboxy-terminal 
α-helices of every subunit. CSN5 and its dimerization partner, CSN6, are intricately embedded at the core of the helical 
bundle. In the substrate-free holoenzyme, CSN5 is autoinhibited, which precludes access to the active site. We find that 
neddylated CRL binding to CSN is sensed by CSN4, and communicated to CSN5 with the assistance of CSN6, resulting in 
activation of the deneddylase. 


CSN, which was first discovered in Arabidopsis thaliana as a repressor 
of constitutive photomorphogenesis, is a protein complex common to 
all eukaryotes. CSN regulates CRLs, a family of ~200 complexes in 
humans implicated in many regulatory processes that together direct 
~20% of proteasome-mediated protein degradation. Enzymatically, 
CSN functions as an isopeptidase that removes the ubiquitin-like activator 
NEDD8 from CRLs, but it can also bind deneddylated CRLs and maintain 
them in an inactive state. CRLs are composed of a cullin protein 
backbone on which a RING domain-containing protein (RBX1 or RBX2) 
and a substrate receptor module are bound. Ubiquitin-loaded E2 
ubiquitin conjugating enzymes are recruited by CRLs using RBX1 or RBX2 
to ubiquitinate substrates recognized by their receptor. CRL activity 
is stimulated by the conjugation of NEDD8 to a conserved lysine residue 
in the cullin C-terminal domain. CSN has emerged as the sole 
enzyme capable of removing NEDD8 modifications from cullins with 
proficiency. It is exquisitely specific for neddylated CRLs and, unlike 
other isopeptidases, has neither general deubiquitination nor 
deneddylation activity. 

Human CSN contains eight distinct proteins (designated CSN1-8 
by decreasing molecular weight, from 57 to 22 kDa), all of which are 
required for full enzymatic activity in vitro. The essentiality of CSN 
has been demonstrated in model organisms: in A. thaliana, the loss of 
any subunit is lethal in the development of seedlings; similarly, in mice, 
knockout of CSN2, 3, 5, 6 or 8 is lethal at the embryonic stage. Several 
CSN subunits in humans show elevated expression in cancer and have 
been implicated in sustaining oncogenic transformation. 

CSN5 provides the catalytic centre for CSN, yet as an isopeptidase it 
is essentially inactive outside the holoenzyme. This raises intriguing 
questions as to how CSN5 is harnessed by CSN and the nature of 
the regulatory mechanism that imposes strict substrate specificity. 

Knowledge of the structure of CSN has been limited to low-resolution 
electron microscopy models of CSN alone and in complex with CRL1 
family members, and those obtained for the related complexes, the 
19S lid component of 26S proteasome (19S lid) and the eukaryotic 
translation initiation factor (eIF3). Interpretation of these maps 
has been aided by molecular models, which have been determined for 
parts of a few individual proteins. Detailed structural studies of 
CSN, required for understanding its unique activity and selectivity towards 
neddylated CRLs, have, however, posed a challenge. 

The 3.8 Å resolution CSN crystal structure presented here provides 
detailed insight into the molecular architecture of the complex. We find 
CSN captured in the crystal in an inactive state, wherein CSN5 occludes 
its own active site. Binding of a neddylated CRL to the holoenzyme trig- 
gers substantial remodelling of CSN4, 5 and 6, resulting in activation of 
the CSN5 isopeptidase. 


Structure of CSN 

Human CSN, consisting of CSN1, 2, 3, 4, 5, 6, 7a and 8, was co-expressed 
and purified from insect cells (Methods). Crystals of CSN lacking flex- 
ible regions of CSN1 (isoform 2, 1-51), CSN5 (residues 1-11) and CSN7a 
(residues 219-275) were obtained and their structure determined by 
X-ray crystallography. These truncations did not impair holoenzyme 
formation or catalytic activity (Extended Data Fig. 1a–d, l). The final 
model includes the two CSN complexes found in the asymmetric unit 
of the crystals, 5,178 amino acids in 16 protomers and two Zn²⁺ ions 
(Extended Data Table 1a and Extended Data Figs 2a–g, 3a–l). Six CSN 
proteins (CSN1-4 and CSN7-8) contain a PCI domain, characterized 
by helical repeats followed by a winged-helix (WH) subdomain 
(Fig. 1a-d). The other two subunits, CSN5 and CSN6, have MPR1/PAD1 
amino-terminal (MPN) domains, a metalloprotease fold. Only 
CSN5, however, has a complete active site and binds zinc. All of the 
subunits have C-terminal helical decorations separated by largely struc- 
tured linkers from their core domains (Supplementary Data). 

CSN has overall dimensions of 173 × 142 × 108 Å (Fig. 1a–c). The 
complex is governed from two organizational centres (Fig. 1e): an open 
ring formed by association of the WH subdomains from the six PCI 
proteins (PCI ring), and an elaborate bundle comprising the C-terminal 
α-helices from each subunit (helical bundle). The PCI and MPN proteins 
form largely distinct subassemblies that are united in the helical bundle. 
The N-terminal helical repeat domains of the PCI proteins (CSN1-4 and 
CSN7-8) radiate from the PCI ring at the base of the complex, the largest 
of which (CSN1-4) form prominent arm-like protrusions (Fig. 1a, e). 
The helical bundle sits across the PCI ring. A heterodimer formed by the 
MPN domains of CSN5 and CSN6 rests on the helical bundle (Fig. 1a, e). 
The MPN dimer (CSN5-CSN6 dimer), helical bundle and PCI ring create 
an intricate three-layered assembly. This overall architecture is shared 
among CSN and its paralogous complexes, the 19S lid and eIF3 (Extended 
Data Fig. 4a-h and Supplementary Discussion). 

Figure 1 | Overall architecture of CSN. a-c, Cartoon representation of 
CSN in three orientations. d, A schematic representation of the domain 
organization of the CSN proteins. Domain boundaries are indicated. 
e, A flattened schematic representation of the three-dimensional structure of 


Detail of the two organizational centres 

The PCI proteins are organized about an open ring formed by association 
of their WH subdomains (Fig. 2a). The short three-stranded 
β-sheets in each WH subdomain are oligomerized edge-to-edge in the 
order CSN7-CSN4-CSN2-CSN1-CSN3-CSN8 to form an 18-stranded 
composite β-sheet at the centre of the complex (Fig. 2b-d). The central 
β-sheet has right-handed curvature and progresses through one incomplete 
helical turn (~300°), resulting in a horseshoe-shaped appearance 
when viewed down the β-strand axis (Fig. 2b, c, Extended Data Fig. 5a-g 
and Supplementary Data). 

The helical bundle lies over the PCI ring at an angle of ~110° from 
the plane of the β-sheet (Fig. 3a). CSN6 forms a U-shaped structure at 
the centre of the bundle with its three C-terminal helices (helices I-III) 
that interacts with every other subunit (Fig. 3b, c). The helices from CSN1, 
2, 3 and 8 surround CSN6 helix III. The 80-Å-long helix from CSN7 
(helix I) contacts CSN6 helices I and II at the base of the bundle, nearest 
the PCI ring. The two helices from CSN4 (helices I and II) form a brace 
roughly perpendicular to the bundle axis in contact with the three C- 
terminal helices of CSN6. CSN5, whose two C-terminal helices form an 
antiparallel hairpin, inserts its final C-terminal helix (helix II) into the 
central CSN6 framework at the core of the bundle. Deletion of the C-terminal 
helices has a pronounced effect on CSN integrity (Extended 
Data Fig. 6a–g and Supplementary Data). 
CSN. The WH subdomains forming the PCI ring are shown as white rings. 
Beyond the two organizational centres and the CSN5-CSN6 dimer, 
interactions are formed between CSN3 and the CSN8 N-terminal repeats, the 
CSN7 N-terminal repeat and the helical bundle (see Extended Data Fig. 5f, g). 


CSN5-CSN6 heterodimer 


The MPN domains of CSN5 and 6 form an intimate dimer with pseudo-two-fold 
symmetry using an interface that buries ~900 Å² of surface 
area (Fig. 4a, b). Although CSN5 and 6 share sequence and structural 
similarity, the catalytic and Zn²⁺-coordinating residues are only present 
in CSN5 (Fig. 4b, c; see later). The CSN5 and 6 chains, which are 
topologically knotted, cross over each other before entering the helical 
bundle (Extended Data Fig. 6h), where they are also closely associated. 
Deletion of the CSN6 MPN domain, leaving its C-terminal helices to 
maintain complex integrity (Extended Data Fig. 6g), results in a CSN 
mutant that is severely catalytically impaired, exhibiting a 100-fold decrease 
in the turnover rate constant (kcat) relative to wild-type CSN (Extended 
Data Fig. 1e, l). These observations point to a role for the CSN6 MPN 
domain in stabilizing the structure of the CSN5 MPN domain. 

Figure 2 | PCI ring assembly. a, Cartoon representation of CSN (grey) with 
the PCI ring highlighted in colour. b, Close-up of a. c, Alternative view of 
b, illustrating the opening of the ring. d, Schematic representation of the 
composite β-sheet. Recurrent hydrogen bonding interactions between WH 
units are shown with dashed lines. 

Figure 3 | Helical bundle assembly. a, Cartoon representation of CSN (grey) 
highlighting the helical bundle formed by the C-terminal helices of every 
subunit in colour. b, Close-up of a. The C-terminal helices are numbered with 
roman numerals as in Fig. 1d. Disordered residues are represented by dashed 
lines. c, An alternative view of b. 
Figure 4 | CSN5 autoinhibition within CSN. a, Cartoon representation of 
CSN (grey) highlighting the CSN5-CSN6 dimer. b, Close-up of a. c, CSN5 
active site. d, Docking of an isopeptide-linked neddylated CRL (yellow) into the 
CSN5 active site. The CRL-NEDD8 isopeptide bond (Protein Data Bank (PDB) 
accession 3DQV) was fitted in the CSN5 active site based on the di-ubiquitin-bound 
structure of AMSH-LP (PDB accession 2ZNV). The side chain of 
Glu 104 coordinates the Zn²⁺ ion and blocks access to the active site, 
autoinhibiting CSN5. e, Model of a catalytically competent state of the CSN5 
active centre based on AMSH-LP and thermolysin. f, Superposition of the Ins-1 
loop from c and e. 


Integration of CSN5 in CSN 


CSN provides the essential framework for CSN5 to function as an iso- 
peptidase. Despite its entangled structure, the order of CSN assembly 
seems to be surprisingly lenient. CSN5 can enter an otherwise com- 
plete seven-subunit CSN and yield a holoenzyme capable of catalysis 
(Extended Data Fig. 7a, b). Because catalytic activity is directly propor- 
tional to the fraction of CSN complexes that have CSN5 included, we 
asked whether the absence of a given subunit prevents CSN5 integra- 
tion (Extended Data Fig. 7c, d). Although the absence of CSN8 or 3 was 
tolerated, excluding CSN1, 2, 4, 6 or 7 strongly disfavoured CSN5 incor- 
poration. Assembly was completely disrupted by the omission of full- 
length CSN6, emphasizing its crucial structural role in the formation of 
the helical bundle. 


CSN5 has an autoinhibited state in CSN 


In CSN, the CSN5 active site Zn²⁺ ion is tetrahedrally coordinated by 
the side chains of His 138, His 140 and Asp 151 of the canonical JAB1/ 
MPN/MOV34 (JAMM) motif, and the side chain of Glu 104 (Fig. 4c). 
Glu 104 is situated in the MPN domain insertion-1 loop segment (Ins-1), 
which is essential for substrate recognition in related isopeptidases. While 
Glu 104 is liganded to the Zn²⁺ ion, Ins-1 occludes the entire CSN5 active 
site (Fig. 4d). In CSN, CSN5 Glu 104 replaces the water molecule that 
acts as the nucleophile in the hydrolysis of the isopeptide bond in related 
isopeptidases. This water is positioned and polarized by an essential 
acidic residue, which is Glu 76 in CSN5 (Fig. 4c). Mutating Glu 76 in 
CSN5 to Ala (CSN (CSN5(E76A))) inactivates CSN (Extended Data 
Fig. 1a). This is analogous to the inactivating mutation first described 
for the MPN domain protease from Archaeoglobus fulgidus, AfJAMM, 
suggesting a shared catalytic mechanism. Although the overall enzymatic 
mechanism appears to be conserved among MPN proteases, the 
active site of CSN5 in the crystal is not configured for catalysis (Fig. 4c, d). 
The Ins-1 conformation observed in the holoenzyme differs from that 
of the crystal structure of CSN5 alone (Extended Data Fig. 8a-c) and 
other MPN proteases (AMSH-LP, RPN11), which typically have 
the catalytic water coordinated to the active site Zn²⁺ ion. For CSN 
activation: (1) the Glu 104 ligand must be removed from the Zn²⁺ ion; 
(2) Ins-1 has to change conformation to position the substrate polypeptide; 
and (3) Glu 76 needs to orient towards the Zn²⁺ ion to activate the 
catalytic water (Fig. 4e, f; see Supplementary Discussion and Extended 
Data Fig. 8d–f). Hence, a mechanism must exist to trigger remodelling 
of CSN5 and activation of CSN. As discussed later, this conformational 
trigger appears to be binding of a neddylated CRL substrate. 


Substrate-induced structural dynamics 


To study how CSN interacts with a substrate, we examined the negative-stain 
electron microscopy structure of an activated CRL, N8SCF^SKP2/CKS1 
(NEDD8-CUL1-RBX1-SKP1-SKP2-CKS1), in complex with CSN 
(CSN–N8SCF^SKP2/CKS1). The CSN structure has good agreement with the 
CSN–N8SCF^SKP2/CKS1 electron microscopy map when fitted as a single 
rigid body (correlation coefficient of 0.77). Rigid body movement of the 
CSN4 helical repeats and the CSN5-CSN6 MPN dimer improved the 
fit (Fig. 5a, b). The model clearly shows the CSN subunits that contact 
N8SCF^SKP2/CKS1, and reveals interactions extending beyond the localized 
interaction between CSN5 and the neddylated cullin. The concave 
face of CSN2 (helical repeats 2-5) embraces the CUL1 C-terminal arm 
(WHB domain) (Fig. 5a, b), similar to what has been proposed previously. 
The SKP2-CKS1 substrate receptor is positioned within 10-20 Å of CSN3/ 
CSN8. Comparing the CSN crystal structure with the CSN–N8SCF^SKP2/CKS1 
electron microscopy model reveals a ~35° rotation of the CSN4 N-terminal 
helical repeats relative to its WH subdomain (Fig. 5a–c). The 
conformer in the high-resolution crystal structure of isolated CSN4, determined 
in the process of solving the CSN structure (Fig. 5c and Extended 
Data Fig. 9a, b), closely matches the CSN4 conformation observed in the 
CSN–N8SCF^SKP2/CKS1 electron microscopy map. The dramatic domain 
motion in CSN4 is enabled by a hinge loop at the end of the helical repeats 
(residues 291-298) (Extended Data Fig. 9a). 
Figure 5 | The CSN-SCF interaction and CRL1-dependent CSN5 
activation. a, b, Fit of crystallographic models of CSN and N8SCF^SKP2/CKS1 
(PDB accessions 1LDK, 3DQV and 2ASS) into the CSN–N8SCF^SKP2/CKS1 
electron microscopy map (Electron Microscopy Data Bank accession 2173 
(ref. 10)). c, CSN4 conformations determined in isolation (mauve) (Extended 
Data Fig. 9a) and in the holoenzyme (purple) orientated as in a shown with 
CSN5-CSN6 dimer. d, Conformational changes in CSN4 are expected to 
impact the CSN5-CSN6 dimer. Close-up of the boxed region in c showing the 
portion of the CSN6 Ins-2 loop (green) in contact with CSN4. Scissors indicate 
the region removed in the CSN6ΔIns-2 mutant (residues 174-179). 


Finding evidence for substrate-induced conformational changes in 
CSN4 led us to examine its functional relevance. In the substrate-free 
CSN holoenzyme, CSN4 contacts the MPN domain of CSN6 through 
an extended interface involving a conserved β-hairpin loop (CSN6 residues 
172-182; the insertion-2 (Ins-2) region in MPN domain proteins) 
(Fig. 5c, d). A conformational change in CSN4 following CRL binding 
would impact the CSN5-CSN6 dimer. Indeed, we observe considerable 
movement of the CSN5-CSN6 dimer in the CSN–N8SCF^SKP2/CKS1 electron 
microscopy envelope away from the helical bundle towards the 
neddylated cullin when compared with CSN in the crystal. A CSN6 deletion 
mutant lacking residues 174-179 of the Ins-2 loop that integrated 
stably into CSN (CSN (CSN6ΔIns-2)) (Fig. 5d and Extended Data Fig. 1f) was 
used to probe the function of the CSN4-CSN6 interface. CSN (CSN6ΔIns-2) 
had a kcat 4.5-fold higher than wild-type CSN, but an indistinguishable 
Michaelis constant (Km) value (Extended Data Fig. 1f, l). Thus, mutating 
the CSN4-CSN6 interface appears to remove an inhibitory component, 
yielding a complex more active than wild type. 

In the absence of a bound neddylated CRL substrate, CSN maintains 
CSN5 in an autoinhibited state (Fig. 4c, d). Discovering a mutation in 
the CSN4-CSN6 interface that conferred greater activity to CSN prompted 
us to question whether the CSN4-CSN6 interface is part of a regulatory 
circuit that inhibits CSN5 until a neddylated CRL substrate is bound. 
Wild-type CSN has very limited isopeptidase activity when assayed with 
ubiquitin-rhodamine, a small artificial substrate for deubiquitinases 
(Extended Data Fig. 1h). CSN (CSN6ΔIns-2), however, cleaves 
ubiquitin-rhodamine robustly with kcat = 0.04 s⁻¹; Km = 1.8 μM (Extended Data 
Fig. 1i, m). Activity towards non-CRL substrates was also found for a 
CSN point mutant carrying CSN5 Glu104Ala in the CSN5 autoinhibitory 
loop (Ins-1) (kcat = 0.04 s⁻¹; Km = 2.7 μM) (Extended Data Fig. 1j, m). 
The double mutant combining CSN6ΔIns-2 and CSN5 Glu104Ala, CSN 
(CSN6ΔIns-2, CSN5(E104A)), had an additive effect on activity, producing 
a complex with greater catalytic activity on ubiquitin-rhodamine than 
wild type and either single mutant (kcat = 0.2 s⁻¹; Km = 6.3 μM) (Extended 
Data Fig. 1k, m). These results suggest that the CSN isopeptidase is inhibited 
by the CSN5 Ins-1 loop bearing Glu 104 and separately through 
the CSN4-CSN6 interface. The CRL substrate-induced conformational 
changes thus provide a mechanism by which non-CRLs are excluded 
from deneddylation. 
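
To compare the three mutants on a common scale, the quoted kcat and Km values can be combined into a catalytic efficiency (kcat/Km) under standard Michaelis-Menten assumptions. The sketch below is illustrative only: the class and variable names are invented here, the mutant labels are shorthand for the complexes described above, and the efficiencies are derived quantities that are not reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten parameters for cleavage of ubiquitin-rhodamine."""
    name: str
    kcat_per_s: float  # turnover number, s^-1
    km_uM: float       # Michaelis constant, micromolar

    def efficiency_per_M_per_s(self):
        """Catalytic efficiency kcat/Km in M^-1 s^-1."""
        return self.kcat_per_s / (self.km_uM * 1e-6)

    def turnover(self, substrate_uM):
        """Rate per enzyme molecule, kcat*[S]/(Km+[S]), in s^-1."""
        return self.kcat_per_s * substrate_uM / (self.km_uM + substrate_uM)


# Values as quoted in the text; labels are shorthand chosen for this sketch.
MUTANTS = [
    Kinetics("CSN6 Ins-2 deletion", 0.04, 1.8),
    Kinetics("CSN5 E104A", 0.04, 2.7),
    Kinetics("double mutant", 0.2, 6.3),
]

for m in MUTANTS:
    print(f"{m.name}: kcat/Km = {m.efficiency_per_M_per_s():.0f} M^-1 s^-1")
```

On these numbers the double mutant has a five-fold higher kcat than either single mutant, while its higher Km means the gain in kcat/Km is smaller than the gain in kcat alone.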

We propose that binding of a neddylated CRL sensed by CSN4 facil- 
itates movement of the CSN5-CSN6 dimer towards the neddylated CRL. 
The proximity of NEDD8 and the cullin to the CSN5 autoinhibitory 
Ins-1 loop may then be sufficient to remodel CSN5, leading to activa- 
tion of CSN and deneddylation (Extended Data Fig. 9c). 

Although CSN, eIF3 and the 19S lid share striking structural simi- 
larity, the intricate substrate-induced activation mechanism identified 
here seems to be unique to CSN (see also Supplementary Discussion 
and Extended Data Fig. 4b-h). 


Concluding remarks 


The structural and functional characterization of CSN and analysis of 
its interaction with a neddylated CRL1 exposes functional roles for parts 
of the holoenzyme: the PCI ring (CSN1-4 and CSN7-8) organizes the 
helical repeat domains, which bind N8SCF^SKP2/CKS1 (principally through 
CSN2 (ref. 10)). The helical bundle enables CSN5 to sense the assembly 
state of CSN, favouring its own integration when the complex is other- 
wise fully assembled. Given that CSN5 is inactive in isolation, this ensures 
that the isopeptidase only becomes functional when CSN is equipped 
to bind CRLs and the induced fit mechanisms (provided by CSN4 and 
CSN6) are in place to activate CSN5 in response to a neddylated CRL. 
This interdependence of architecture and function explains why CSN 
acts exclusively on neddylated CRLs and avoids unregulated deubiqui- 
tinase activity. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
The γ-secretase complex, comprising presenilin 1 (PS1), PEN-2, APH-1 and nicastrin, is a membrane-embedded protease 
that controls a number of important cellular functions through substrate cleavage. Aberrant cleavage of the amyloid 
precursor protein (APP) results in aggregation of amyloid-β, which accumulates in the brain and consequently causes 
Alzheimer’s disease. Here we report the three-dimensional structure of an intact human γ-secretase complex at 4.5 Å 
resolution, determined by cryo-electron-microscopy single-particle analysis. The γ-secretase complex comprises a 
horseshoe-shaped transmembrane domain, which contains 19 transmembrane segments (TMs), and a large extracellular domain 
(ECD) from nicastrin, which sits immediately above the hollow space formed by the TM horseshoe. Intriguingly, nicastrin 
ECD is structurally similar to a large family of peptidases exemplified by the glutamate carboxypeptidase PSMA. This 
structure serves as an important basis for understanding the functional mechanisms of the γ-secretase complex. 


γ-Secretase is a membrane-embedded aspartyl protease that cleaves a 
large number of transmembrane substrate proteins within their 
membrane-spanning regions, with the cleavage products serving as signalling 
molecules. This process is known as regulated intramembrane 
proteolysis (RIP). Two extensively studied substrates of γ-secretase are 
the amyloid precursor protein (APP) and the Notch receptor. Successive 
cleavages of APP give rise to several amyloid-β peptides, each with a 
different length. Aberrant accumulation of an aggregation-prone 
42-residue amyloid-β (Aβ42) over a 40-residue product (Aβ40) leads to 
formation of amyloid-β plaques in the brain, triggering the development 
and pathogenesis of Alzheimer’s disease. Cleavage of the Notch 
receptor results in the release and translocation of its intracellular domain 
into the nucleus. Abnormal Notch signalling is linked to developmental 
defects and several types of cancer. 

The γ-secretase complex consists of four components: PS1, PEN-2, 
APH-1 and nicastrin, each containing at least one predicted 
transmembrane segment (TM). Together, these proteins have a molecular 
weight of approximately 170 kilodaltons (kDa), whereas the nicastrin 
ECD has an additional 30-70 kDa of glycosylation. Presenilin is the 
catalytic component and contains nine TMs. Association with PEN-2 
facilitates an autocatalytic cleavage of presenilin between TM6 and 
TM7, producing two fragments known as the amino-terminal fragment 
(NTF) and the carboxy-terminal fragment (CTF). APH-1 and nicastrin 
assemble into a stable subcomplex, which then interacts with the 
CTF of presenilin. Nicastrin contains a large extracellular domain 
that is thought to be responsible for substrate recruitment. The central 
role of presenilin in the γ-secretase complex is evidenced by the 
identification of over 150 missense mutations, each derived from a patient 
with Alzheimer’s disease. 

Despite advances in understanding of the functional aspects of 
γ-secretase, structural characterization has been extremely slow, owing 
mainly to the daunting challenges of expression and purification of the 
intact γ-secretase. The limited structural information on γ-secretase is 
restricted to low-resolution images derived from electron microscopy 
analysis, a nuclear magnetic resonance (NMR) structure of the 
CTF of presenilin, and a crystal structure of an archaeal homologue 
of presenilin. Consequently, there is little mechanistic understanding 
of the γ-secretase functions. 

During the past several years, we have made rigorous efforts to prepare 
homogeneous, active human γ-secretase for structural investigation. 
We attempted cryo-electron-microscopy (cryo-EM) single-particle 
reconstruction by exploiting technological advances in direct electron 
detection and statistical image processing. Recent applications of this 
rapidly developing technology include near 3 Å resolution structures of a 
mitochondrial ribosome large subunit, the 12-fold symmetric 
F420-reducing hydrogenase, and the fourfold symmetric TRPV1 complex. 
Despite these advances, near-atomic resolution reconstruction remains 
challenging for smaller, non-symmetric proteins such as human γ-secretase. 
In this study, we report a three-dimensional structure of this 
membrane-embedded complex with an overall resolution of 4.5 Å, which reveals its 
domain architecture, secondary structural elements, TM arrangement, 
and ECD fold, and provides important functional insights. 


Preparation of the γ-secretase complex 

The human APH-1 is encoded by two genes, APH-1A and APH-1B, of 
which APH-1A seems to be more important”. Similarly, human pre- 
senilin has two forms: PS1 and PS2, and PS1 contains the vast majority of 
disease-derived mutations*’. Owing to these considerations, we focused 
our effort on the human γ-secretase that comprises PS1, PEN-2, APH- 
1aL (the major form of APH-1; referred to hereafter as APH-1) and nicas- 
trin. We initially assembled a systematic effort to examine the expres- 
sion levels of the individual components, select subcomplexes, as well 
as the intact γ-secretase complex in four different expression systems: 
bacteria, yeast, insect cells and mammalian cells. We succeeded in tran- 
sient co-expression of all four components of the human γ-secretase 
complex in mammalian HEK293F cells. The coding sequences of PS1, 
PEN-2, APH-1 and nicastrin were individually cloned into our custom- 
designed pMLink plasmid, in which expression of each of the four γ- 
secretase components was under the control of a separate promoter (Extended 
Data Fig. 1a, b). The resulting pMLink plasmid was transfected into 
HEK293F cells (Fig. 1a). 

To facilitate purification, we used a range of different affinity tags to 
label the N or C termini of the four individual components. The best 
outcome was achieved with a Flag tag at the N terminus of PEN-2. The 
γ-secretase-containing membrane fractions of HEK293F cells, extracted 
by the detergent CHAPSO, were purified over an anti-Flag affinity resin 
and further fractionated on a size exclusion column (Fig. 1a, b). The re- 
sulting γ-secretase complex exhibited excellent solution behaviour and 
could be easily visualized on SDS-PAGE by Coomassie blue staining, 
free of any major contaminating protein. Importantly, the NTF and CTF 
were clearly visible, suggesting completion of PS1 autoproteolysis in the 
presence of the other three components. By contrast, expression of PS1 
alone yielded the intact, uncleaved protein (Extended Data Fig. 1c). 

Presence of the NTF and CTF is indicative of an active γ-secretase 
complex. To examine this, we reconstituted a γ-secretase activity assay 
using the substrate APP-C100, which contains the C-terminal 100 amino 
acids of APP™*. Incubation of γ-secretase with the substrate in a 1:10 molar 
ratio led to generation of APP intracellular domain (AICD) (Fig. 1c). The 
presenilin-specific inhibitor III-31C (ref. 35), but not DMSO (dimethyl- 
sulphoxide), blocked the cleavage of APP-C100. The level of γ-secretase 
activity is similar to what had been reported’*. The same conclusion was 
obtained for γ-secretase in the presence of amphipol A8-35 under the 
same buffer condition as used in later cryo-EM analysis (Extended Data 
Fig. 1d). We concluded that the human γ-secretase was in an active confor- 
mation. Nevertheless, there is a possibility that, given sample manipula- 
tion, the electron-microscopy structure described below may not represent 
the fully active conformation. 


Cryo-EM analysis of γ-secretase 


Initial attempts to image γ-secretase in digitonin using an FEI Falcon- 
II direct-electron detector produced a three-dimensional reconstruc- 
tion with a large disc-shaped ‘body’ and a protruding ‘head’, which could 
accommodate the TMs and extracellular domains of γ-secretase, res- 
pectively (Extended Data Fig. 2a, d). However, despite sharp contrast in 
the individual particles, this reconstruction showed few internal fea- 
tures. The TMs were not clearly resolved, and the strongest density ap- 
peared at the periphery of the disc-shaped body, which is likely to have 
derived from the detergent digitonin. These results concurred with rela- 
tively poor accuracies in the alignment of the particles as estimated in the 
employed statistical refinement procedure”, and suggested that the dis- 
ordered nature of the detergent molecules and the small size of the com- 
plex precluded correct alignment of the particles. 

To minimize the effect of the disordered detergent on refinement, 
we replaced digitonin with amphipol A8-35 (Extended Data Fig. 2b, 
d). In addition, we also imaged these samples using a Gatan K2 Summit 


[Figure 1 panels. a, Workflow: transfection of pMLink plasmid into HEK293 cells; purification of membrane fraction; affinity purification by anti-Flag resin; gel filtration. b, Gel-filtration profile, x axis: elution volume (ml), with molecular weight standards (kDa) marked.] 


Figure 1 | Expression and purification of active human γ-secretase. a, A 
schematic diagram of the protocol for the expression and purification of the 
intact human γ-secretase complex. pMLink is our custom-designed vector for 
simultaneous co-expression of multiple proteins in mammalian cells. b, A 
representative gel-filtration chromatography of human γ-secretase. The peak 
fractions were visualized on SDS-PAGE by Coomassie staining. PS1 had been 

direct-electron detector in single-electron counting mode to achieve 
higher signal-to-noise ratios at the lower spatial frequencies, which are 
crucial for particle alignment. Combined with statistical image clas- 
sification and movie processing”, this approach produced a markedly 
improved map with an overall resolution of 4.5 Å (Fig. 2a and Extended 
Data Figs 2c, d and 3). 

At this resolution, 19 TMs were identified, the β-strands in the nicas- 
trin ECD were well-resolved, and side-chain densities started to show for 
portions of the nicastrin ECD and some of the TMs (Fig. 2a). Densities 
for some of the linker sequences between neighbouring TMs were 
improved by further image classification, which led to a map with an 
overall resolution of 5.4 Å from a subset of the particles (Extended Data 
Fig. 3b). The overall correctness of the density map and its handedness 
were confirmed by the tilt-pair test” (Extended Data Fig. 4). 


Overall structure of the γ-secretase 


The 19 TMs are organized into a horseshoe-shaped structure (Fig. 2b, top 
panel). In contrast to the density for the TMs, the density for the con- 
necting sequences between neighbouring TMs is weak or absent, possibly 
reflecting the disordered nature in these hydrophilic loops. Nevertheless, 
at least seven TMs are connected by strong density (Extended Data Fig. 5), 
suggesting their order of linkage in γ-secretase. For ease of discussion, we 
numbered the 19 TMs (Fig. 2b, bottom panel). These TMs exhibit quite 
different lengths, with two connected TMs (TM17 and TM18) protruding 
halfway into the membrane from the cytoplasmic side (Fig. 2b and 
Extended Data Fig. 5). Two bent TMs (TM6 and TM7) are placed on 
the concave side of the horseshoe, facing the hollow centre. The large, 
empty pocket seems to be poised for binding to some structural element; 
perhaps the substrate protein. 

The distribution of the 19 TMs is uneven, with considerably more 
TMs concentrated on one end of the horseshoe-shaped structure 
(referred to as the ‘thick’ end) than the other end (the ‘thin’ end). In 
the thin end, there are no more than two layers of TMs when viewed 
perpendicular to the membrane (Fig. 2b, bottom panel). By contrast, the 
thick end has at least three layers of TMs. The archaeal homologue of 
PS1, mmPSH, exhibits a relatively complex membrane topology, with 
three layers of TMs in an inactive conformation”’. Assuming all TMs in 
the γ-secretase have been identified in the current electron-microscopy 
maps, this analysis suggests that PS1 might be located within the thick 
end of the TM horseshoe. 

There is a large region of well-defined density outside the membrane- 
spanning region, and the density vastly exceeds that of the sequences 
from γ-secretase on the intracellular side. Among the four components 
of γ-secretase, nicastrin is the only one that has a sizable ECD, and 
most of the extracellular density is thus attributable to nicastrin (Fig. 2). 
Intriguingly, nicastrin ECD is located immediately above the hollow 
centre of the TM horseshoe and interacts closely with the extracellular 
loops of several TMs on both ends of the horseshoe (Fig. 2b). This 

completely autoproteolysed into NTF and CTF, whereas nicastrin (NCT) 
existed in two forms: immature (iNCT) and mature (mNCT), reflecting 
differences in glycosylation. c, The purified γ-secretase was proteolytically 
active against the APP substrate C100. Cleavage of the substrate APP-C100 was 
blocked by the specific inhibitor III-31C. 

Figure 2 | Overall structure of the human γ-secretase complex. a, An overall 
density map for the entire human γ-secretase complex. α-Carbon traces are 
shown for some of the TMs and the ECD. The 5.4-Å map was used in both a and 
b. The electron-microscopy maps are coloured cyan. b, Overall structure of the 
human γ-secretase complex. Structure of the γ-secretase is viewed from within 
the plane of lipid membrane (upper panel). The 19 TMs from the four 


organization is consistent with the reported function of substrate re- 
cruitment for nicastrin™. 

Human γ-secretase contains four full-length proteins: PS1 (residues 
1-467), PEN-2 (residues 1-101), APH-1 (residues 1-265), and nicastrin 
(residues 1-709). With glycosylation of nicastrin, the predicted molecu- 
lar weight of the intact γ-secretase is approximately 230 kDa (ref. 22). 
The observed density accounts for approximately half of the total mo- 
lecular weight of γ-secretase, with the 19 TMs accommodating about 
500 residues and nicastrin ECD containing about 650 residues. The lack 
of obvious density for the other sequences is likely to reflect their flexible 
nature, including the 30-70 kDa of oligosaccharides on glycosylated 
residues in the nicastrin ECD. Only 43 of the 181 residues predicted 
to be on the cytoplasmic side of PS1 are hydrophobic, representing 24 
per cent of the total sequences and unlikely to be sufficient for formation 
of a stable structural core. In addition, the extracellular sequences for 
PEN-2 (residues 1-19 and 78-101) are predicted to be hydrophilic and 
flexible. These residues are missing in the current maps. 
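
The mass bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a textbook average residue mass of about 110 Da (not a value from this study) and uses the residue counts and the 230 kDa total quoted above.

```python
# Rough mass bookkeeping for the density observed in the map (illustrative).
# Assumption: an average amino-acid residue mass of ~110 Da (textbook value,
# not taken from the paper).
AVG_RESIDUE_MASS_DA = 110.0


def protein_mass_kda(n_residues, avg_mass_da=AVG_RESIDUE_MASS_DA):
    """Approximate polypeptide mass in kDa from a residue count."""
    return n_residues * avg_mass_da / 1000.0


tm_residues = 500        # residues accommodated by the 19 TMs (from the text)
ecd_residues = 650       # residues in the nicastrin ECD (from the text)
total_mass_kda = 230.0   # predicted mass of glycosylated complex (from the text)

observed_kda = protein_mass_kda(tm_residues + ecd_residues)
observed_fraction = observed_kda / total_mass_kda

# Fraction of hydrophobic residues on the cytoplasmic side of PS1 (43 of 181).
hydrophobic_fraction = 43 / 181

print(f"observed mass ~{observed_kda:.1f} kDa "
      f"({100 * observed_fraction:.0f}% of {total_mass_kda:.0f} kDa)")
print(f"hydrophobic cytoplasmic residues: {100 * hydrophobic_fraction:.0f}%")
```

Under that assumption the observed residues amount to roughly half of the predicted total, in line with the statement above.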


Structure of nicastrin ECD 

Nicastrin ECD was previously predicted to conform to the aminopep- 
tidase superfamily fold”*. The relatively high-resolution features in the 
density for nicastrin ECD (Fig. 2a, Fig. 3a and Extended Data Fig. 6) 
prompted us to pursue a model for its domain architecture. To facilitate 
this task, we searched for sequences in the Protein Data Bank (PDB) that 
are homologous to those of nicastrin ECD. One of the matches was the 
glutamate carboxyl peptidase PSMA (PDB code 2XEF (ref. 39)) (Extended 
Data Fig. 7), confirming the earlier prediction®. Of the 218 aligned amino 
acids between PSMA and nicastrin, 52 are identical and 80 are similar. 
Visual inspection of the extracellular electron-microscopy density revealed 
an excellent match to the structure of PSMA”. The conserved topology 
between these two structures allowed tracing of approximately 400 resi- 
dues with side chains and 20 residues as poly-Ala sequences in the nicas- 
trin ECD (Fig. 3b). The remaining unassigned electron-microscopy density 
is relatively poor and can accommodate about 200 residues. The modelled 

[Figure 2 panel labels: cavity; thin end; thick end; TMs numbered 1 to 19.] 

components of γ-secretase are coloured blue, whereas the ECD of nicastrin is 
shown in green. A cut-through section of the 19 TMs in γ-secretase is shown in 
the bottom panel. The TMs form a horseshoe-shaped structure, with more TMs 
concentrated at the thick end. The TMs are numbered arbitrarily, for ease of 
discussion, from 1 to 19. Figures 2a and 3 were prepared using PyMol”, and 
Fig. 2b was made in Chimera”. 


part of nicastrin ECD resembles a dumbbell, with a large lobe and a small 
lobe (Fig. 3b), which can be superimposed with those of PSMA with 
root-mean-squared deviations (r.m.s.d.) of approximately 2.6 and 3.6 Å 
over 231 and 111 aligned α-carbon (Cα) atoms, respectively (Fig. 3c). 
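
As a reminder of what the two comparison statistics in this section measure, the following minimal sketch computes percentage identity from the alignment counts quoted above and an r.m.s.d. over paired Cα coordinates. The coordinates are toy values invented for illustration, and the function assumes the two structures have already been superimposed; it does not perform the superposition itself.

```python
import math


def percent(part, whole):
    """Percentage of `part` relative to `whole`."""
    return 100.0 * part / whole


def rmsd(coords_a, coords_b):
    """Root-mean-squared deviation between paired 3D points (same order).

    Assumes the two coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Alignment statistics quoted in the text: 218 aligned residues,
# 52 identical and 80 similar.
identity = percent(52, 218)
similarity = percent(80, 218)

# Toy Calpha traces (hypothetical values, in angstroms): every atom of the
# second trace is displaced by 1 angstrom along z, so the r.m.s.d. is 1.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
toy_rmsd = rmsd(model, reference)

print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
print(f"toy r.m.s.d. {toy_rmsd:.2f} angstroms")
```

A sequence identity of about 24 per cent is low, which is why the agreement of the fold in the density map matters more here than the sequence match alone.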


Discussion 


The structural homology between nicastrin ECD and the peptidase 
PSMA may not support the possibility that nicastrin could serve as an 
active protease in cells. PSMA is a zinc metalloprotease and the major- 
ity of the residues that coordinate the two zinc ions in PSMA have 
been replaced in nicastrin (Extended Data Fig. 7). Moreover, we have 
been unable to detect any protease activity for nicastrin ECD in vitro 
using a variety of potential substrate proteins under diverse condi- 
tions. Nevertheless, the fact that nicastrin ECD shares a conserved 
fold with PSMA and other peptidases supports the idea that nicastrin 
may be involved in substrate recruitment'*””. Nicastrin ECD seems to 
contain a surface groove approximately 40 Å above the lipid mem- 
brane, facing the hollow centre of the TM horseshoe (Fig. 3d). This 
surface groove could be a putative substrate-binding site. Because the 
active site of PS1 is predicted to be located approximately 20 Å below 
the surface of the lipid membrane”, the putative substrate-binding 
site is at least 60 Å away from the catalytic Asp residues in PS1. As- 
suming the N terminus of the β-secretase cleavage product APP-C99 is 
recognized by this surface groove’, a distance of 60 Å can be con- 
veniently spanned by the primary cleavage products of APP-C99: 
Aβ40 and Aβ42. Supporting this analysis, Glu 333, which was thought 
to be responsible for substrate binding’*”’, resides at the centre of the 
groove (Fig. 3d). The residue in PSMA that corresponds to Glu 333 of 
nicastrin is directly involved in zinc binding”. 
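
The distance argument above can be made explicit with a short calculation. This is only a plausibility sketch: the rise per residue for an extended chain (about 3.3 Å) and for an α-helix (1.5 Å) are textbook approximations that we assume here, not measurements from this structure.

```python
# Plausibility check: can a 40- or 42-residue peptide bridge the distance
# between the nicastrin surface groove and the PS1 catalytic aspartates?
GROOVE_HEIGHT_A = 40.0      # groove height above the membrane (from the text)
ACTIVE_SITE_DEPTH_A = 20.0  # active-site depth below the surface (from the text)

# Assumptions (textbook approximations, not from the paper):
RISE_EXTENDED_A = 3.3       # rise per residue, fully extended chain
RISE_HELIX_A = 1.5          # rise per residue, alpha-helix


def vertical_separation(height_above, depth_below):
    """Minimum separation along the membrane normal, in angstroms."""
    return height_above + depth_below


def max_span(n_residues, rise_per_residue):
    """Approximate end-to-end length of a peptide, in angstroms."""
    return n_residues * rise_per_residue


separation = vertical_separation(GROOVE_HEIGHT_A, ACTIVE_SITE_DEPTH_A)

for name, length in (("Abeta40", 40), ("Abeta42", 42)):
    helical = max_span(length, RISE_HELIX_A)
    extended = max_span(length, RISE_EXTENDED_A)
    print(f"{name}: helical ~{helical:.0f} A, extended ~{extended:.0f} A, "
          f"needed >= {separation:.0f} A")
```

Under these assumptions even a fully helical 40-residue peptide is about 60 Å long, and any partly extended conformation is longer, so the proposed geometry is at least not ruled out by peptide length.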

The lack of side-chain features in the density for the 19 TMs does not 
allow assignment of the four components. The weak density for the 
loops connecting neighbouring TMs further complicates the assign- 
ment task. Nevertheless, we suggest a speculative assignment, in which 
all 9 TMs of PS1 are located within the thick end of the TM horseshoe 

[Figure 3 panel labels: small lobe; cytoplasm.] 


Figure 3 | Structure of the extracellular domain of nicastrin. 
a, Representative cryo-EM density for β-strands (left panels) and α-helices 
(right panels) of the nicastrin ECD. The 4.5-Å map was used here. b, The overall 
structure of nicastrin ECD closely resembles that of the glutamate carboxyl 
peptidase PSMA”. The atomic model of nicastrin ECD is shown in green. The 
structure of PSMA is shown in grey, displayed in the right panel for 


(Extended Data Fig. 8a). The PS1 homologue mmPSH contains three 
layers of TMs”. Based on the current electron-microscopy density, the 
thick end is the only place in the horseshoe structure with three layers 
of TMs. The putative assignment of TM1 from PS1 was facilitated by 
the bent nature of TM1 in mmPSH”. PEN-2 was shown to be in close 
proximity of the CTF of PS1 (ref. 40) and to directly bind TM4 of PS1 
(refs 41,42), and APH-1 and nicastrin were thought to interact with the 
CTF of PS1 (refs 16, 17); these features are recapitulated in our model. 
In this speculative model, TM6 and TM7 of PS1, which harbour the 
catalytic Asp residues, and TM9, which contains the substrate recog- 
nition sequence, face the hollow centre of the TM horseshoe (Extended 
Data Fig. 8a). This analysis suggests the location of substrate cleavage 
by γ-secretase. The two TMs of PEN-2 are likely to be inserted between 
TM7 and TM8 on the cytoplasmic side, leading to a major conforma- 
tional rearrangement of the TMs in PS1 compared to those in mmPSH”® 
and opening of the putative substrate entry site (Extended Data Fig. 8b). 
This analysis might explain why PS1 autoproteolysis only occurs in the 
presence of PEN-2. Despite the charm of this model, we cannot rule out 
the opposing possibility, whereby PS1 is placed in the thin end (Extended 
Data Fig. 8c). After all, it remains to be seen whether mmPSH represents a 
sound structural model for PS1 or whether all TMs in the γ-secretase have 
been identified. 

Although the overall resolution of our structure is 4.5 Å, the resolution 
for the TMs is lower and thus does not allow modelling of specific side 
chains. Compared with digitonin, amphipol clearly performed 
better in the cryo-EM analysis of γ-secretase and helped to improve the 
quality of image reconstruction. The use of amphipol was reported pre- 
viously in at least two cryo-EM studies of membrane proteins’’’. As a 
new class of surfactants designed to improve the solubility of membrane 

comparison. c, Structural comparison between nicastrin (green) and PSMA” 
(grey) for the large lobe (top panel) and the small lobe (bottom panel). 
d, Identification of a putative substrate-binding site in nicastrin ECD. A surface 
groove on nicastrin ECD, located 40 Å above the lipid membrane, faces the 
hollow centre of the TM horseshoe. Glu 333, which is thought to have an 
important role in substrate recruitment'*”, resides in the groove. 


proteins“, amphipols may prove to be an important tool for future 
electron-microscopy-based investigation of many other membrane 
proteins. 

Recent structural investigations of intramembrane proteases such as 
the prokaryotic homologues of rhomboid*”’, S2P** and presenilin”® 
have provided hints about the functional mechanisms of these mem- 
brane-embedded signalling proteases. In this study, we report the first 
cryo-EM density map of human γ-secretase in which individual β- 
strands are clearly separated in nicastrin ECD and 19 TMs form a 
horseshoe-like structure. Our observed structural features are different 
from those that were derived from previous low-resolution electron- 
microscopy studies of γ-secretase””*. Our structure marks an import- 
ant step towards elucidating the molecular mechanisms of this key 
enzyme whose aberrant activity engenders Alzheimer’s disease. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

The origin of the local 1/4-keV X-ray flux in both 
charge exchange and a hot bubble 


M. Galeazzi', M. Chiao’, M. R. Collier”, T. Cravens’, D. Koutroumpa‘, K. D. Kuntz?, R. Lallement®, S. T. Lepri’, D. McCammon’, 
K. Morgan’, F. S. Porter’, I. P. Robertson®, S. L. Snowden’, N. E. Thomas’, Y. Uprety', E. Ursino! & B. M. Walsh?+ 


The solar neighbourhood is the closest and most easily studied sam- 
ple of the Galactic interstellar medium, an understanding of which 
is essential for models of star formation and galaxy evolution. Obser- 
vations of an unexpectedly intense diffuse flux of easily absorbed 
1/4-kiloelectronvolt X-rays’, coupled with the discovery that inter- 
stellar space within about a hundred parsecs of the Sun is almost com- 
pletely devoid of cool absorbing gas’, led to a picture of a ‘local cavity’ 
filled with X-ray-emitting hot gas, dubbed the local hot bubble**. 
This model was recently challenged by suggestions that the emission 
could instead be readily produced within the Solar System by heavy 
solar-wind ions exchanging electrons with neutral H and He in inter- 
planetary space’""’, potentially removing the major piece of evidence 
for the local existence of million-degree gas within the Galactic 
disk’*"*. Here we report observations showing that the total solar- 
wind charge-exchange contribution is approximately 40 per cent of 
the 1/4-keV flux in the Galactic plane. The fact that the measured 
flux is not dominated by charge exchange supports the notion of a 
million-degree hot bubble extending about a hundred parsecs from 
the Sun. 

When the highly ionized solar wind interacts with neutral gas, an 
electron may hop from a neutral to an outer orbital of an ion, in what is 
known as charge exchange. The electron then cascades to the ground 
state of the ion, often emitting soft X-rays in the process’®. The calcula- 
tions of X-ray intensity from solar-wind charge exchange depend on 
limited information about heavy ion fluxes and even more uncertain 


[Figure 1 labels: neutral He trajectories; He focusing cone.] 


Figure 1 | The He focusing cone. Modelled interstellar He density (blue is 
low density; red is high density) showing the He focusing cone. Keplerian He 
orbits, Earth’s orbit, and the DXL and ROSAT observing geometries are 
also shown. 


atomic cross-sections. The ‘Diffuse X-rays from the Local galaxy’ (DXL) 
sounding rocket mission’” was launched from the White Sands Missile 
Range in New Mexico, USA, on 12 December 2012 to make an empir- 
ical measurement of the charge exchange flux by observing a region of 
higher interplanetary neutral density (with a correspondingly higher 
charge exchange rate) called the ‘helium focusing cone’. Neutral inter- 
stellar gas flows at about 25 km s⁻¹ through the Solar System owing to 
the motion of the Sun through a small interstellar cloud. This material, 
mostly hydrogen atoms but about 15% helium, flows from the Galactic 
direction (longitude l, latitude b) ~ (3°, 16°), placing Earth downstream 
of the Sun in early December’*. The trajectories of the neutral interstel- 
lar helium atoms are governed primarily by gravity, executing hyper- 
bolic Keplerian orbits and forming a relatively high-density focusing 
cone downstream of the Sun about 6° below the ecliptic plane (Fig. 1)’”. 
Interstellar hydrogen, on the other hand, is also strongly affected by ra- 
diation pressure and photoionization: radiation pressure balances grav- 
ity, reducing the focusing effect, while photoionization creates a neutral 
hydrogen cavity around the Sun. 
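
The gravitational focusing described above can be illustrated with a minimal two-body sketch. It assumes a cold, parallel inflow at 25 km s⁻¹, a point-mass Sun, and no radiation pressure or ionization losses; these simplifications are ours and are far cruder than the models used in the analysis. For a hyperbolic orbit with impact parameter b and speed v∞ at infinity, the orbit equation gives a crossing of the downwind axis at r = b²v∞²/(2GM).

```python
import math

GM_SUN = 1.32712440018e20   # m^3 s^-2, solar gravitational parameter
AU = 1.495978707e11         # m, astronomical unit
V_INF = 25.0e3              # m s^-1, inflow speed quoted in the text


def axis_crossing_distance(impact_parameter_m, v_inf=V_INF, gm=GM_SUN):
    """Heliocentric distance (m) at which a hyperbolic orbit crosses the
    downwind axis, r = b^2 v_inf^2 / (2 GM)."""
    return (impact_parameter_m * v_inf) ** 2 / (2.0 * gm)


def impact_parameter_for(distance_m, v_inf=V_INF, gm=GM_SUN):
    """Impact parameter (m) of the orbit that crosses the axis at distance_m."""
    return math.sqrt(2.0 * gm * distance_m) / v_inf


# Atoms that reach the downwind axis at Earth's distance started with an
# impact parameter of roughly 1.7 AU under these assumptions.
b_1au = impact_parameter_for(AU) / AU
print(f"impact parameter for crossing at 1 AU: {b_1au:.2f} AU")
```

Because atoms arriving from every azimuth around the inflow axis with the same impact parameter converge on the same downwind point, the density along that axis is enhanced, which is the focusing cone.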

The early December launch of DXL placed the He focusing cone near 
the zenith at midnight. The 7° field of view was scanned slowly back and 
forth across one side of the cone and more rapidly in a full circle to test 
the consistency of the derived charge exchange contribution in other 
directions and to make measurements of the detector particle back- 
ground while DXL was looking towards Earth (Extended Data Fig. 1). 
Figure 2 shows the ROSAT All Sky Survey 1/4-keV map” (R12 band) 
with the paths of the DXL slow scan (red) and fast scan (white) over- 
plotted. The ROSAT observation of the slow-scan region was performed 

Figure 2 | The DXL scan path. ROSAT all-sky survey map in the 1/4-keV 
(R12) energy band, shown in Galactic coordinates (contours are labelled in 
degrees) with l = 180°, b = 0° at the centre. The colour scale shows flux 
intensity. The units are ROSAT units, RU. The DXL scan path is the white band 
with the slow portion shown in red. The black line is the 90° horizon for the 
DXL flight. The width of the band represents the half-power diameter of 
the instrument beam. 

Figure 3 | Neutral atom column density for DXL and ROSAT. Neutral 
column density distribution integrals for each line of sight along the scan path. 
The density distribution in the integrals is weighted by one over the distance 
from the Sun squared (1/R²) to reflect the dilution of the solar wind as it flows 
outward. The red lines are the integral for He (solid) and H (dashed) in the DXL 
geometry. The blue lines represent the integral for He (solid) and H (dashed) in 
the ROSAT geometry. The black line shows the Galactic latitude during the 
scan. DXL is significantly more affected by the He focusing cone, while in both 
cases the H contribution is small. 


in September 1990 when the line of sight was about one astronomical 
unit (the Earth-Sun distance, 1 AU) away from, and parallel to, the He 
cone, so its charge exchange contribution was not strongly affected by 
the cone enhancement (Fig. 1). 
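
The 1/R²-weighted column density described in the Figure 3 caption can be sketched numerically. The example below is a hedged illustration, not the model used in the analysis: the function name, the uniform density of 0.015 cm⁻³ and the integration limit are all hypothetical choices, and a realistic calculation would use a modelled He or H distribution instead.

```python
import math


def weighted_column(density, observer, direction, s_max_au=50.0, n_steps=20000):
    """Trapezoidal estimate of the integral of density / R^2 along a line of sight.

    Coordinates are heliocentric and in AU; `density` maps (x, y, z) to cm^-3.
    The result has units of cm^-3 AU^-1. The line of sight must not pass
    through the Sun (R = 0).
    """
    norm = math.sqrt(sum(c * c for c in direction))
    unit = tuple(c / norm for c in direction)
    ds = s_max_au / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        s = i * ds
        point = tuple(o + s * u for o, u in zip(observer, unit))
        r_squared = sum(c * c for c in point)
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end points
        total += weight * density(*point) / r_squared * ds
    return total


def uniform_density(x, y, z):
    """Hypothetical uniform neutral density in cm^-3 (illustration only)."""
    return 0.015


# Observer at 1 AU on the x axis, looking directly away from the Sun.
value = weighted_column(uniform_density, (1.0, 0.0, 0.0), (1.0, 0.0, 0.0))

# For a uniform density n0 the integral is analytic: n0 * (1/R0 - 1/(R0 + s_max)).
expected = 0.015 * (1.0 / 1.0 - 1.0 / 51.0)
print(f"numerical {value:.6f}, analytic {expected:.6f} cm^-3 AU^-1")
```

The weighting means that gas close to the Sun dominates the integral, so a line of sight that runs along the focusing cone is affected far more than one that passes about 1 AU to the side of it.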

For this analysis, we chose pulse height limits for both of the DXL 
proportional counters (Counter-I and Counter-II) to match the pulse 
heights of the ROSAT 1/4-keV band as closely as possible (Extended 
Data Fig. 2). This energy range is dominated by and contains most of the 
emission from solar-wind charge exchange and/or the local hot bubble. 
To quantify the solar-wind charge exchange emission we compared 
both DXL and ROSAT count rates to well determined models of the 
interplanetary neutral distribution along the lines of sight for both sets 
of measurements (Fig. 3)'’*'. Figure 4 shows the DXL and ROSAT 
count rates along the DXL scan path as functions of Galactic longitude. 
The figure shows the combined Counter-I and Counter-II count rates 
(black dots) during the DXL scan and the ROSAT 1/4-keV count rates 
in the same directions (blue solid line). The best fit to the DXL total count 
rate (red solid line), and the solar-wind charge exchange contributions 
to DXL (red dashed line) and ROSAT (blue dashed line) rates are also 
shown (see Table 1 for best-fit parameters: the model shown corresponds 
to the second column). There is potentially an additional contribution 
from charge exchange between the solar-wind ions and the geocoronal 
hydrogen surrounding Earth, which tracks the short-term variations 

Figure 4 | Fit to DXL and ROSAT data. Combined Counter-I and Counter-II 
count rates (black dots) during the DXL scan and ROSAT 1/4-keV count rate in 
the same directions (blue solid line). The best fit to the DXL total count rate 
(red solid line), and the solar-wind charge exchange contribution to DXL 
(red dashed line) and ROSAT 1/4-keV bands (blue dashed line) are also shown. 
The error bars are s.e.m. 

Table 1 | Best-fit model parameters 

ROSAT geocoronal solar-wind charge exchange (RU)     0              50             0              50 
αH/αHe                                               1              1              2              2 
Counter-I                                            0.91 ± 0.06    0.96 ± 0.06    0.87 ± 0.07    0.92 ± 0.07 
Counter-II                                           0.97 ± 0.07    1.02 ± 0.07    0.92 ± 0.07    0.99 ± 0.07 
np(R0)vrelαHe (RU cm³ AU)                            5,223 ± 770    3,197 ± 720    4,500 ± 490    3,180 ± 460 
DXL/ROSAT solar-wind flux                            0.63 ± 0.13    0.91 ± 0.25    0.73 ± 0.14    0.96 ± 0.20 
χ² (136 degrees of freedom)                          207            196            206            194 
Total solar-wind charge exchange contribution 
to ROSAT R12 data                                    (39 ± 6)%      (37 ± 8)%      (39 ± 4)%      (42 ± 6)% 

Summary of best-fit parameters for different assumptions for the geocoronal contribution to the ROSAT 
R12 band, and the ratio between hydrogen and helium compound cross-sections αH/αHe. Counter-I and 
Counter-II are the ratios of the fitted DXL response to the nominal value from laboratory calibrations (the 
corrections are well within the range expected from spectral uncertainties). vrel is the relative speed 
between solar wind and neutral flow and np is the proton density. For the ratio of solar-wind fluxes 
during the DXL and ROSAT measurements, we note that although both missions were near the solar 
maximum (and therefore should have similar isotropic composition), solar activity in terms of sunspots 
was weaker during the DXL measurement, as reflected by the fitted ratios. The total solar-wind charge 
exchange contribution to ROSAT (interplanetary + geocoronal) is divided by the observed R12 rate and 
given as a percentage of R12 at b = 0° (333 RU, which is the lowest anywhere on the scan and close to 
the lowest on sky). Errors are 1σ. 


in solar-wind flux. Time variations of a few days or less were removed 
from the ROSAT maps, and the current best estimate of the residual 
from geocoronal charge exchange is about 50 ROSAT units (1 RU = 
10⁻⁶ counts s⁻¹ arcmin⁻²) for the ROSAT 1/4-keV band (K.D.K., J. 
Carter, M.P.C., Y. M. Colladovega, M.R.C., T.E.C., D.K., F.S.P., A. Read, 
I.P.R., D.G. Sibeck, S. F. Sembay, S.L.S., N.E.T. and D.M.W., manuscript 
in preparation). The geocoronal contribution to the DXL flux should be 
negligible, owing to the look direction, which is directly away from the 
Sun. The signature of the cone enhancement in the DXL data compared 
to the ROSAT rates is evident, highlighting the contribution from charge 
exchange. However, the best fit shows that the total charge exchange 
contribution to ROSAT is only about 40% ± 5% (statistical error) ± 5% 
(systematic error) of the total flux observed at the Galactic plane. Its 
contribution to the ROSAT flux over the DXL scan path is typically about 
140 RU. For comparison, the total ROSAT 1/4-keV flux ranges from 
around 300 RU to 400 RU in the plane up to 1,400 RU in the brightest 
areas at intermediate and high latitudes. This result implies that the mea- 
sured fluxes are dominated by interstellar emission, strengthening the 
original idea of a hot bubble filling the local interstellar medium for a 
hundred parsecs or so in all directions from the Sun. 
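
The flux budget implied by these numbers can be written out explicitly. In the sketch below, the 333 RU plane value is the one quoted in the Table 1 legend; combining the statistical and systematic errors in quadrature is our assumption (it treats them as independent) and is not a procedure stated in the text.

```python
import math

R12_PLANE_RU = 333.0        # ROSAT R12 rate at b = 0 (from the Table 1 legend)
SWCX_FRACTION = 0.40        # charge-exchange share of the plane flux (from the text)
STAT_ERR = 0.05             # statistical error on the fraction
SYST_ERR = 0.05             # systematic error on the fraction

# Assumption: independent errors combined in quadrature.
combined_err = math.hypot(STAT_ERR, SYST_ERR)

swcx_ru = SWCX_FRACTION * R12_PLANE_RU
swcx_err_ru = combined_err * R12_PLANE_RU
residual_ru = R12_PLANE_RU - swcx_ru   # flux left for the local hot bubble

print(f"charge exchange: {swcx_ru:.0f} +/- {swcx_err_ru:.0f} RU")
print(f"remaining interstellar emission: {residual_ru:.0f} RU")
```

With these inputs the charge-exchange share comes to roughly 130 RU, close to the typical value of about 140 RU quoted for the scan path, and about 200 RU of the plane flux remains to be explained by the hot bubble.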

It has been pointed out that a hot bubble creates an apparent pres- 
sure balance problem with the tenuous warm cloud that the Sun is pass- 
ing through. However, recent results on the magnetic contribution to the 
cloud pressure” and new three-dimensional maps of the local inter- 
stellar medium” bring the implied pressure of the plasma in the local 
hot bubble to rough agreement with pressures derived for the local in- 
terstellar clouds when the measured contribution from the solar-wind 
charge exchange is removed from the local hot bubble emission”. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

Cohesive forces prevent the rotational breakup of 
rubble-pile asteroid (29075) 1950 DA 


Ben Rozitis', Eric MacLennan’ & Joshua P. Emery’ 


Space missions’ and ground-based observations’ have shown that 
some asteroids are loose collections of rubble rather than solid bodies. 
The physical behaviour of such ‘rubble-pile’ asteroids has been tra- 
ditionally described using only gravitational and frictional forces 
within a granular material’. Cohesive forces in the form of small van 
der Waals forces between constituent grains have recently been pre- 
dicted to be important for small rubble piles (ten kilometres across 
or less), and could potentially explain fast rotation rates in the small- 
asteroid population* *. The strongest evidence so far has come from 
an analysis of the rotational breakup of the main-belt comet P/2013 R3 
(ref. 7), although that was indirect and poorly constrained by obser- 
vations. Here we report that the kilometre-sized asteroid (29075) 
1950 DA (ref. 8) is a rubble pile that is rotating faster than is allowed 
by gravity and friction. We find that cohesive forces are required to 
prevent surface mass shedding and structural failure, and that the 
strengths of the forces are comparable to, though somewhat less than, 
the forces found between the grains of lunar regolith. 

It is possible to infer the existence of cohesive forces within an aster- 
oid by determining whether it is a rubble pile with insufficient self- 
gravity to prevent rotational breakup by centrifugal forces. One of the 
largest known candidates is the near-Earth asteroid (29075) 1950 DA 
(mean diameter of 1.3 km; ref. 8), because it has a rotation period of 
2.1216 h that is just beyond the critical spin limit of about 2.2 h estimated 
for a cohesionless asteroid’. A rubble-pile structure and the degree of self- 
gravity can be determined by a bulk density measurement, which can 
be acquired through model-to-measurement comparisons of Yarkovsky 
orbital drift’®. This drift arises on a rotating asteroid with non-zero ther- 
mal inertia, and is caused by the delayed thermal emission of absorbed 
sunlight, which applies a small propulsion force to the asteroid’s after- 
noon side. Thermal-infrared observations can constrain the thermal 
inertia value’’, and precise astrometric position measurements con- 
ducted over several years can constrain the degree of Yarkovsky orbital 
drift”. Recently, the orbital semimajor axis of (29075) 1950 DA has been 
observed to be decreasing at a rate of 44.1 ± 8.5 m yr⁻¹ because of the
Yarkovsky effect’, which indicates that the asteroid’s sense of rotation 
must be retrograde. Using the Advanced Thermophysical Model’**, 
in combination with the retrograde radar shape model’, archival WISE 
thermal-infrared data’* (Extended Data Table 1, and Extended Data Figs 1 
and 2), and orbital state!”, we determined the thermal inertia and bulk 
density of (29075) 1950 DA (see Methods). The thermal inertia value 
was found to be remarkably low at 24 (+20/−14) J m⁻² K⁻¹ s⁻¹/², which gives
a corresponding bulk density of 1.7 ± 0.7 g cm⁻³ (Fig. 1 and Extended
Data Fig. 3). This bulk density is much lower than the minimum value
of 3.5 g cm⁻³ required to prevent loss of surface material by centrifugal
forces (Fig. 2). 

Spectral observations of (29075) 1950 DA indicate either an E- or M- 
type classification in the Tholen taxonomic system'*. However, its low 
optical albedo and low radar circular polarization ratio® rule out the 
E-type classification (Extended Data Table 2). The derived bulk den- 
sity is inconsistent with the traditional view that M-type asteroids are 
metallic bodies. However, the Rosetta spacecraft encounter with main- 
belt asteroid (21) Lutetia has demonstrated that not all M-type asteroids 


are metal-rich’’”. Indeed, the low radar albedo® of (29075) 1950 DA is 
very similar to that of (21) Lutetia, suggesting a similar composition. The 
best meteorite analogue for (21) Lutetia is an enstatite chondrite’, which 
has a grain density of 3.55 g cm⁻³. Taking the same meteorite analogue
and grain density for (29075) 1950 DA implies a macro-porosity of
51 ± 19% and indicates that it is a rubble-pile asteroid (Fig. 1).
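
As a rough check of the macro-porosity quoted above, the following Python sketch assumes the usual definition, porosity = 1 − (bulk density / grain density). The function name is illustrative, and the simple error propagation ignores any uncertainty in the grain density, so it only approximates the distribution-based value given in the text.

```python
def macro_porosity(bulk_density, grain_density):
    """Fraction of the volume that is void space (assumed definition)."""
    return 1.0 - bulk_density / grain_density


# Values quoted in the text, both in g cm^-3.
BULK_DENSITY = 1.7
BULK_DENSITY_ERR = 0.7
GRAIN_DENSITY = 3.55  # enstatite chondrite analogue

nominal = macro_porosity(BULK_DENSITY, GRAIN_DENSITY)
# First-order propagation of the bulk-density uncertainty only.
spread = BULK_DENSITY_ERR / GRAIN_DENSITY

print(f"macro-porosity = {100 * nominal:.0f}% +/- {100 * spread:.0f}%")
```

With the nominal values this gives roughly 52% with a spread of about 20%, in line with the quoted 51 ± 19%.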

Given that the WISE observations were taken when (29075) 1950 DA 
was about 1.7 AU (one astronomical unit, AU, is the distance from Earth 
to the Sun) from the Sun, the derived thermal inertia value scales to 
36 (+30/−20) J m⁻² K⁻¹ s⁻¹/² at 1 AU because of temperature-dependent effects.
This scaled value is comparable to the ~45 J m⁻² K⁻¹ s⁻¹/² value
determined for the lunar surface from thermal-infrared measurements!*, 
and implies the presence of a similar fine-grained regolith. This is con- 
sistent with (29075) 1950 DA’s low radar circular polarization ratio, 
which suggests a very smooth surface at centimetre to decimetre scales’. 
The sub-observer latitude of the WISE observations was ~2°, which 
indicates that this surface material was primarily detected around (29075) 
1950 DA’s equator. 

For the derived bulk density, (29075) 1950 DA has 48 + 24% of its 
surface experiencing negative ambient gravity (that is, surface elements 
Figure 1 | Physical property distributions of (29075) 1950 DA. These were
derived by the Advanced Thermophysical Model (ATPM) at the 3σ confidence
level by χ² fitting to the 16 WISE thermal-infrared observations and to the
observed rate of Yarkovsky orbital drift. The best model fit had a reduced-χ²
value of 1.06 with a corresponding P value of 0.39. The distributions have
median values and 1σ ranges of 24 (+20/−14) J m⁻² K⁻¹ s⁻¹/², 1.7 ± 0.7 g cm⁻³,
51 ± 19%, 48 ± 24%, (3 ± 1) × 10⁻⁵ g_E, and 64 (+12/−20) Pa for the thermal inertia,
bulk density, macro-porosity, negative ambient gravity area, peak negative
ambient gravity, and cohesive strength, respectively.
Figure 2 | Degree of negative ambient gravity for (29075) 1950 DA. The
area of the surface experiencing negative ambient gravity (solid line) is plotted
against the primary (left) y axis, and the peak negative ambient gravity
(dashed line) is plotted against the secondary (right) y axis. Both are plotted as
functions of bulk density for the nominal diameter of 1.3 km. The vertical lines
represent the 1σ range derived for the bulk density, that is, 1.7 ± 0.7 g cm⁻³.


where rotational centrifugal forces dominate over self-gravity) with peak 
outward accelerations of (3 ± 1) × 10⁻⁵ g_E (where g_E is 9.81 m s⁻²)
around the equator (Methods; Figs 1, 2 and 3). This makes the pres- 
ence of a fine-grained regolith unexpected, and requires the existence 
of cohesive forces for (29075) 1950 DA to retain such a surface. In gran- 
ular mechanics, the strength of this cohesive force is represented by the 
bond number B, which is defined as the ratio of this force to the grain’s 
weight. Lunar regolith has been found to be highly cohesive because of 
van der Waals forces arising between grains’’, and experimental and 
theoretical studies have shown that the bond number for this cohesive 
force is given by 


$$B = 10^{-5}\, g_A^{-1}\, d^{-2} \tag{1}$$

where g_A is the ambient gravity and d is the grain diameter. To prevent
loss of surface material requires bond numbers of at least one, but
surface stability requires the bond numbers to be greater than ten, which
places limits on the possible grain sizes present. For a peak negative
ambient gravity of 3 × 10⁻⁵ g_E, this relationship dictates that only grains
with diameters less than ~6 cm can be present and stable on the aster- 
oid’s surface. 
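
The grain-size limit can be reproduced from equation (1) with a short Python sketch. It assumes that equation (1) is evaluated in SI units (ambient gravity in m s⁻², grain diameter in m), an assumption that is not stated in the text but that recovers the quoted ~6 cm limit; the function names are illustrative.

```python
import math

G_EARTH = 9.81  # m s^-2, the value of g_E quoted in the text


def bond_number(ambient_gravity, grain_diameter):
    """Equation (1), assuming SI units: B = 1e-5 / (g_A * d**2)."""
    return 1e-5 / (ambient_gravity * grain_diameter ** 2)


def max_stable_diameter(ambient_gravity, min_bond_number=10.0):
    """Largest grain diameter (m) whose bond number still reaches the threshold."""
    return math.sqrt(1e-5 / (min_bond_number * ambient_gravity))


# Magnitude of the peak negative ambient gravity, 3e-5 g_E.
peak_gravity = 3e-5 * G_EARTH
d_max = max_stable_diameter(peak_gravity)

print(f"largest stable grain: {100 * d_max:.1f} cm")
```

Setting `min_bond_number=1.0` instead gives the weaker limit for simply preventing loss of surface material.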

This upper limit of a diameter of ~6 cm for the grains is consistent 
with (29075) 1950 DA’s lunar-like regolith. In particular, lunar regolith 
has micrometre- to centimetre-sized grains described by an approximate
d⁻³ size distribution. The rubble-pile asteroid (25143) Itokawa
Figure 3 | Gravitational slopes of (29075) 1950 DA. These were produced
using the retrograde radar shape model⁸ with the nominal derived bulk
density of 1.7 g cm⁻³. Gravitational slopes greater than 90°, which occur
predominantly around the equator, indicate that those surface elements are
experiencing negative ambient gravity.
Figure 4 | Minimum internal cohesive strength of (29075) 1950 DA. This
was calculated using the Drucker-Prager failure criterion as a function of bulk
density (x axis) and angle of friction (shown on the figure) for the nominal
diameter of 1.3 km. The vertical lines represent the 1σ range derived for the bulk
density, that is, 1.7 ± 0.7 g cm⁻³.


also has a d⁻³ grain size distribution but has boulders ranging up to
~40 m in size on the surface, which is reflected in its much higher
thermal inertia value of ~750 J m⁻² K⁻¹ s⁻¹/² (ref. 21). (29075) 1950 DA
might have had large boulders present on its surface in the past, but 
these would have been progressively lost in order of size as it was spun- 
up by the YORP effect (that is, spin state changes caused by the aniso- 
tropic reflection and thermal re-emission of sunlight from an irregularly 
shaped asteroid’®). This spinning-up selection process leaves behind the 
relatively fine-grained regolith with low thermal inertia that we infer 
today”, and would operate in addition to the thermal fatigue mech- 
anism of asteroid regolith formation”. 

To check whether internal cohesive forces are also required to pre- 
vent the structural failure of (29075) 1950 DA, we applied the Drucker- 
Prager model for determining the failure stresses within a geological 
material*® (Methods). In this model, the maximum spin rate that a 
rubble-pile asteroid can adopt depends on its overall shape, degree of 
self-gravity and internal strength. The internal strength results from 
the angle of friction between constituent grains and any cohesive forces 
present. Using the dynamically equivalent and equal-volume ellipsoid of 
(29075) 1950 DA, and using an angle of friction typical for lunar regolith 
of 40° (ref. 19), we find that a minimum cohesive strength of 64 (+12/−20) Pa
is required to prevent structural failure (Figs 1 and 4). This is less than 
that of 100 Pa measured for weak lunar regolith”’, and is within the range 
of 3-300 Pa estimated by numerical simulations of rubble-pile asteroids®. 
It is also consistent with the range of 40-210 Pa estimated for the pre- 
cursor body of the main-belt comet P/2013 R3 (ref. 7). This finding 
proves that not all small asteroids rotating faster than the cohesionless 
critical spin limit are coherent bodies or monoliths*~. It also supports 
the view that some high-altitude bursting meteors, such as the impact- 
ing asteroid 2008 TC3 (ref. 23), are very small rubble piles held to- 
gether by cohesive forces®. 

Finally, given that (29075) 1950 DA has a 1 in 19,800 chance of im- 
pacting the Earth in 2880 (ref. 12), and has the potential to break up like 
P/2013 R3 because of its tensional state, there are implications for im- 
pact mitigation. Some suggested deflection techniques, such as the kin- 
etic impactor”, violently interact with the target asteroid and have the 
potential to destabilize long-ranging granular force networks present”. 
With such tenuous cohesive forces holding one of these asteroids to- 
gether, a very small impulse may result in complete disruption. This 
may have happened to the precursor body of P/2013 R3 through a me- 
teorite impact. Therefore, there is a potential danger of turning one 
Earth-threatening asteroid into several if cohesive forces within rub- 
ble-pile asteroids are not properly understood. 
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METHODS 

Thermophysical modelling. The ATPM was used to determine the thermal in- 
ertia and bulk density of (29075) 1950 DA. The ATPM was developed to interpret 
thermal-infrared observations of planetary surfaces lacking atmospheres’, and si- 
multaneously make asteroidal Yarkovsky and YORP effect predictions’. Accurate 
interpretation of thermal-infrared observations was verified by applying it to the 
Moon", and it has been successfully applied to asteroids (1862) Apollo” and (101955) 
Bennu to determine their thermal and physical properties.

To summarize how it works, the ATPM computes the surface temperature
variation for each surface element during a rotation by solving one-dimensional heat
conduction with a surface boundary condition that includes direct and multiply 
scattered sunlight, shadowing, and re-absorbed thermal radiation from interfacing 
surface elements (that is, global self-heating effects). Rough-surface thermal-infrared 
beaming (that is, thermal re-emission of absorbed solar energy back towards the 
Sun) is explicitly included in the form of hemispherical craters, which have been 
shown to accurately recreate the lunar thermal-infrared beaming effect’. The degree 
of roughness for each surface element is specified by the fraction of its area covered 
by the (rough) hemispherical craters, f_R. The asteroid thermal emission as a
function of wavelength, rotation phase and various thermophysical properties is
determined by applying the Planck function to the derived temperatures and summing
across visible surface elements. The Yarkovsky and YORP effects are then determined 
by computing the total recoil forces and torques from photons reflected off and 
thermally emitted from the asteroid surface. 
Analysis of WISE thermal-infrared observations. The thermal inertia of (29075) 
1950 DA was determined using archival WISE thermal-infrared observations, which 
were obtained on 12–13 July 2010 UT (Universal Time) during the WISE All-Sky
survey’®. All instances of WISE observations of (29075) 1950 DA were taken from 
the Minor Planet Center database and used to query the WISE All-Sky Single Expo- 
sure (L1b) source database via the NASA/IPAC Infrared Service Archive. Search 
constraints of 10’’ within the Minor Planet Center ephemeris of (29075) 1950 DA, 
and Julian dates within 10 s of the reported observations, were used to ensure proper 
data retrieval. The magnitudes returned from this query were kept only in the in- 
stances in which there was a positive object detection or where a 95% confidence 
brightness upper limit was reported. (29075) 1950 DA had a faint apparent visual 
magnitude of 20.5 when the WISE observations were taken, and was only detected 
at 3σ levels or greater in the W3 (12 μm) and W4 (22 μm) WISE infrared bands.
Additionally, we used only data points that repeatedly sampled common rotation 
phases of (29075) 1950 DA to ensure consistency, and to avoid outliers, within the 
data set. This resulted in 14 useable data points in the W3 band and 2 useable data 
points in the W4 band (Extended Data Table 1). The WISE images for these data 
points were also retrieved to check for any contaminating sources or extended objects 
surrounding (29075) 1950 DA (see Extended Data Fig. 2). The WISE magnitudes 
were converted to fluxes, and the reported red-blue calibrator discrepancy” was 
taken into account. A 5% uncertainty was also added in quadrature to the reported 
observational uncertainties to take into account additional calibration uncertain- 
ties’’. As in previous works of WISE asteroid observations (for example, ref. 15), 
we colour-correct the model fluxes using the WISE corrections of ref. 27 rather 
than colour-correcting the observed fluxes. 

The free parameters to be constrained by the WISE observations in the model 
fitting include the diameter, thermal inertia, surface roughness and rotation phase. 
Although the radar circular polarization ratio indicates a very smooth surface at 
centimetre-to-decimetre spatial scales®, it does not provide a constraint on surface 
roughness occurring at smaller spatial scales that are comparable to the depth of 
(29075) 1950 DA’s thermal skin (~1 mm). Surface roughness occurring at these
spatial scales induces the thermal-infrared beaming effect, which requires that roughness
must be left as a free parameter to allow the full range of possible interpretations of 
the WISE thermal-infrared observations to be obtained. In addition, the uncer- 
tainty on (29075) 1950 DA’s measured rotation period does not allow accurate phas- 
ing of the radar shape model between the light-curve observations taken in 2001 
and the WISE observations taken in 2010. Therefore, the rotation phase of the first 
observation (used as the reference) was left as a free parameter in the model fitting. 

In the model fitting, the model thermal flux predictions, F_MOD(λ_n, D, Γ, f_R, φ),
were compared with the observations, F_OBS(λ_n), and observational errors, σ_OBS(λ_n),
by varying the diameter D, thermal inertia Γ, roughness fraction f_R and rotation
phase φ to give the minimum χ² fit

$$\chi^2 = \sum_{n=1}^{N} \left[ \frac{F_{\mathrm{MOD}}(\lambda_n, D, \Gamma, f_R, \varphi) - F_{\mathrm{OBS}}(\lambda_n)}{\sigma_{\mathrm{OBS}}(\lambda_n)} \right]^2 \tag{2}$$

for a set of n = 1 to N observations with wavelength λ_n. Separate thermophysical
models were run for thermal inertia values ranging from 0 to 1,000 J m⁻² K⁻¹ s⁻¹/²
in equally spaced steps of 20 J m⁻² K⁻¹ s⁻¹/² initially, and then between 0 and
90 J m⁻² K⁻¹ s⁻¹/² in 2 J m⁻² K⁻¹ s⁻¹/² steps once the probable thermal inertia value
had been constrained. The diameter, roughness fraction and rotation phase were
also stepped through their plausible ranges, forming a four-dimensional grid of 
model test parameters (or test clones) with the thermal inertia steps. A parameter 
region bounded by a constant Δχ² at the 3σ confidence level then defined the range
of possible parameters. Finally, counting the number of acceptable test clones in
each parameter value bin then allowed us to obtain the probability distribution for
each free parameter. The best model fit had a reduced-χ² value of 1.06 with a
corresponding P value of 0.39, and an example model fit to the WISE observations is
shown in Extended Data Fig. 1. 
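
The fitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the ATPM itself: the fluxes, clone labels and the Δχ² threshold below are hypothetical placeholders, and a real application would generate one model flux set per grid point of diameter, thermal inertia, roughness fraction and rotation phase.

```python
def chi_squared(model_flux, obs_flux, obs_err):
    """Sum of squared, error-normalised residuals, as in equation (2)."""
    return sum(((m - o) / e) ** 2
               for m, o, e in zip(model_flux, obs_flux, obs_err))


def acceptable_clones(chi2_by_clone, delta_chi2):
    """Keep every test clone within delta_chi2 of the best-fitting clone."""
    best = min(chi2_by_clone.values())
    return [name for name, value in chi2_by_clone.items()
            if value <= best + delta_chi2]


# Hypothetical observations and two hypothetical test clones.
obs_flux = [10.0, 20.0]
obs_err = [1.0, 2.0]
model_fluxes = {
    "clone_A": [11.0, 18.0],
    "clone_B": [13.0, 26.0],
}

chi2 = {name: chi_squared(flux, obs_flux, obs_err)
        for name, flux in model_fluxes.items()}
# The threshold is a placeholder; the appropriate value depends on the
# confidence level and the number of free parameters.
kept = acceptable_clones(chi2, delta_chi2=9.0)

print(chi2, kept)
```

Counting how many retained clones fall in each parameter bin then gives the probability distribution for that parameter.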

Unfortunately, the WISE data alone do not place unique constraints on the
diameter, thermal inertia and surface roughness because of their limited phase angle
and wavelength coverage. As shown in Extended Data Fig. 3, the best-fitting diam- 
eter increases with thermal inertia such that a unique constraint cannot be made. 
Fortunately, the radar observations had constrained (29075) 1950 DA’s diameter 
to be 1.3km with a maximum error of 10% (ref. 8). Therefore, by allowing the 
diameter to vary between 1.17 km and 1.43 km only, we found that (29075) 1950 
DA’s thermal inertia value must be very low, that is, 24 (+20/−14) J m⁻² K⁻¹ s⁻¹/² or
≤82 J m⁻² K⁻¹ s⁻¹/². This result is consistent with the preliminary upper bound
of 110 J m⁻² K⁻¹ s⁻¹/² determined by ref. 28 using a simpler thermophysical model
that neglected rough-surface thermal-infrared beaming effects. In our work, the sur- 
face roughness still remains unconstrained, but must be included for the Yarkovsky 
effect analysis described below. The probability distribution for the derived ther- 
mal inertia value is shown in Fig. 1. 
Analysis of Yarkovsky orbital drift. The bulk density of (29075) 1950 DA could 
be determined by model-to-measurement comparisons of its Yarkovsky semima- 
jor axis drift. The authors of ref. 12 were able to measure a transverse acceleration 
of (−6.70 ± 1.29) × 10⁻¹⁵ au per day squared acting on (29075) 1950 DA in its orbit
by using optical astrometry dating back to 1950 and radar ranging data taken in 2001
and 2012. This transverse acceleration corresponded to a rate of change in
semimajor axis of (−2.95 ± 0.57) × 10⁻⁴ au per million years (Myr) or −44.1 ± 8.5 m yr⁻¹
(ref. 29). Parameter studies using the ATPM have shown that the Yarkovsky effect 
is in general enhanced by rough-surface thermal-infrared beaming"*. For (29075) 
1950 DA’s oblate shape, fast rotation period, and low thermal inertia, the potential 
enhancement was rather large (see Extended Data Fig. 4) and had to be included to 
prevent underestimation of (29075) 1950 DA’s bulk density. The overall Yarkovsky 
drift acting on (29075) 1950 DA, da/dt(D, Γ, f_R, ρ), for a bulk density ρ was
determined from

$$\frac{da}{dt}(D,\Gamma,f_R,\rho) = \left(\frac{D_0}{D}\right)\left(\frac{\rho_0}{\rho}\right)\left[(1-f_R)\,\frac{da}{dt}(\Gamma)_{\mathrm{smooth}} + f_R\,\frac{da}{dt}(\Gamma)_{\mathrm{rough}} + \frac{da}{dt}(\Gamma)_{\mathrm{seasonal}}\right] \tag{3}$$

where da/dt(Γ)_smooth is the smooth surface component, da/dt(Γ)_rough is the rough
surface component, and da/dt(Γ)_seasonal is the seasonal component. Each
component was evaluated separately at a specified initial diameter D₀ and bulk density
ρ₀. A Yarkovsky effect prediction was produced for every test clone deemed
acceptable from the WISE flux-fitting described above. To produce the distribution
of possible bulk densities, each prediction was compared against 500 samples of 
Yarkovsky drift that were randomly selected from a normal distribution with a mean 
and standard deviation of −44.1 ± 8.5 m yr⁻¹. This ensured that the uncertainty on
the measured Yarkovsky drift was taken into account. Extended Data Fig. 3 shows
the derived bulk density as a function of thermal inertia for the range of acceptable
test clones without a thermal inertia constraint. As indicated, to match the observed
drift at the 1σ level required the bulk density to be less than 2.7 g cm⁻³ regardless
of the thermal inertia value. Using the thermal inertia constraint, the bulk density
was constrained to be 1.7 ± 0.7 g cm⁻³ and its probability distribution is shown in
Fig. 1. 
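
Two steps of this analysis lend themselves to a short Python sketch: the unit conversion of the measured drift, and the use of the inverse scaling of drift with bulk density to turn a model prediction into a density estimate. The model drift and reference density below are hypothetical placeholders, not ATPM output, and the helper names are illustrative; the 500 normally distributed drift samples follow the description in the text.

```python
import random

AU_IN_M = 1.495978707e11  # metres per astronomical unit


def au_per_myr_to_m_per_yr(rate):
    """Convert a semimajor-axis drift from au per Myr to metres per year."""
    return rate * AU_IN_M / 1.0e6


def bulk_density_from_drift(model_drift, reference_density, observed_drift):
    """Drift scales as 1/density, so rho = rho_0 * model_drift / observed_drift."""
    return reference_density * model_drift / observed_drift


drift_m_per_yr = au_per_myr_to_m_per_yr(-2.95e-4)
drift_err_m_per_yr = au_per_myr_to_m_per_yr(0.57e-4)

# Hypothetical model prediction: -75 m/yr at a reference density of 1,000 kg m^-3.
MODEL_DRIFT = -75.0
REFERENCE_DENSITY = 1000.0

rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [rng.gauss(-44.1, 8.5) for _ in range(500)]
densities = [bulk_density_from_drift(MODEL_DRIFT, REFERENCE_DENSITY, s)
             for s in samples]
nominal_density = bulk_density_from_drift(MODEL_DRIFT, REFERENCE_DENSITY, -44.1)

print(f"{drift_m_per_yr:.1f} +/- {drift_err_m_per_yr:.1f} m/yr")
```

Repeating the density calculation for every acceptable test clone, rather than for a single model value as here, would build up the distribution of possible bulk densities.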

Cohesive force modelling. Surface and internal cohesive forces are required to pre- 
vent surface mass shedding and structural failure of (29075) 1950 DA, respectively. 
The surface cohesive forces are proportional to the magnitude of the negative ambi- 
ent gravity experienced by the surface. In particular, the gravitational acceleration 
g acting at a particular point x of (29075) 1950 DA’s surface was determined using a
polyhedral gravity field model. Under asteroid rotation the ambient gravitational
acceleration at that surface point will be modified by centripetal acceleration, such
that

$$\mathbf{g}' = \mathbf{g} + \omega^2\left[\mathbf{x} - (\mathbf{x}\cdot\hat{\mathbf{p}})\,\hat{\mathbf{p}}\right] \tag{4}$$

where ω is the asteroid angular velocity and p̂ is the unit vector specifying the
orientation of the asteroid rotation pole. The ambient surface gravity g_A acting along
the surface normal n̂ of point x is then given by

$$g_A = -\mathbf{g}'\cdot\hat{\mathbf{n}} \tag{5}$$
and, finally, the effective gravitational slope θ is given by

$$\theta = \cos^{-1}\!\left(g_A / |\mathbf{g}'|\right) \tag{6}$$


To accurately measure the area of the asteroid’s surface experiencing negative 
ambient gravity each shape model facet was split into one hundred smaller sub- 
facets. The negative ambient gravity as a function of bulk density is shown in Fig. 2, 
and a three-dimensional plot of gravitational slope is shown in Fig. 3. The bond 
number for a particular regolith grain diameter in a specified degree of negative 
ambient gravity was then determined from equation (1). 
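
The chain from gravitational acceleration to ambient gravity and gravitational slope can be illustrated with a Python sketch. It makes a strong simplifying assumption: the asteroid is treated as a homogeneous sphere with point-mass gravity instead of the polyhedral shape model used in this work, and the rotational term is taken as ω² times the vector from the spin axis to the surface point. All function names are illustrative.

```python
import math

G = 6.6743e-11   # gravitational constant, m^3 kg^-1 s^-2
G_EARTH = 9.81   # m s^-2


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def effective_gravity(g, x, omega, pole):
    """g' = g + omega^2 [x - (x . p) p]: gravity plus the rotational term."""
    along_pole = dot(x, pole)
    return [g_i + omega ** 2 * (x_i - along_pole * p_i)
            for g_i, x_i, p_i in zip(g, x, pole)]


def ambient_gravity(g_eff, normal):
    """g_A = -g' . n: positive when the surface is pulled inward."""
    return -dot(g_eff, normal)


def gravitational_slope_deg(g_eff, normal):
    """theta = arccos(g_A / |g'|), returned in degrees."""
    ratio = ambient_gravity(g_eff, normal) / math.sqrt(dot(g_eff, g_eff))
    return math.degrees(math.acos(max(-1.0, min(1.0, ratio))))


def sphere_surface_point(direction, radius, density):
    """Position, outward normal and gravity on a homogeneous sphere (assumption)."""
    mass = density * 4.0 / 3.0 * math.pi * radius ** 3
    g_mag = G * mass / radius ** 2
    position = [radius * c for c in direction]
    gravity = [-g_mag * c for c in direction]
    return position, list(direction), gravity


omega = 2.0 * math.pi / (2.1216 * 3600.0)  # rad s^-1 for a 2.1216 h period
pole = [0.0, 0.0, 1.0]
radius, density = 650.0, 1700.0            # m, kg m^-3 (nominal values)

x_eq, n_eq, g_eq = sphere_surface_point([1.0, 0.0, 0.0], radius, density)
g_eff_eq = effective_gravity(g_eq, x_eq, omega, pole)
g_a_equator = ambient_gravity(g_eff_eq, n_eq)
slope_equator = gravitational_slope_deg(g_eff_eq, n_eq)

x_po, n_po, g_po = sphere_surface_point(pole, radius, density)
g_eff_po = effective_gravity(g_po, x_po, omega, pole)
g_a_pole = ambient_gravity(g_eff_po, n_po)
slope_pole = gravitational_slope_deg(g_eff_po, n_po)

print(f"equator: g_A = {g_a_equator / G_EARTH:.2e} g_E, slope = {slope_equator:.0f} deg")
print(f"pole:    g_A = {g_a_pole / G_EARTH:.2e} g_E, slope = {slope_pole:.0f} deg")
```

Even in this spherical approximation the equatorial ambient gravity is negative and of order 10⁻⁵ g_E, while the pole remains gravitationally bound; the real shape model changes the numbers but not the sign.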

The minimum internal cohesive force required to prevent structural failure of 
(29075) 1950 DA was determined analytically from the Drucker-Prager failure 
criterion and a model of interior stresses within the asteroid*®. For a homogenous 
ellipsoidal body with semi-axes a, b and c the average normal stress components 
are 


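$$\sigma_x = \left(\rho\omega^2 - 2\pi\rho^2 G A_x\right)\frac{a^2}{5}, \qquad \sigma_y = \left(\rho\omega^2 - 2\pi\rho^2 G A_y\right)\frac{b^2}{5}, \qquad \sigma_z = \left(-2\pi\rho^2 G A_z\right)\frac{c^2}{5} \tag{7}$$

where G is the gravitational constant. The A_i terms are dimensionless coefficients
that depend only on the shape of the body, and are A_x = 0.57003, A_y = 0.60584, and
A_z = 0.82413 for the dynamically equivalent and equal-volume ellipsoid of (29075)
1950 DA, which were determined from equation (4.6) of ref. 4. The Drucker-
Prager failure criterion using the average stresses is given by

$$\frac{1}{6}\left[(\sigma_x-\sigma_y)^2 + (\sigma_y-\sigma_z)^2 + (\sigma_z-\sigma_x)^2\right] \le \left[k - s\,(\sigma_x+\sigma_y+\sigma_z)\right]^2 \tag{8}$$

where k is the internal cohesion and s is the slope constant. The slope constant is
determined from the angle of friction φ using

$$s = \frac{2\sin\varphi}{\sqrt{3}\,(3-\sin\varphi)} \tag{9}$$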

An angle of friction consistent with lunar regolith of 40° (ref. 19) was assumed 
to calculate the minimum internal cohesive force required to prevent structural 
failure of (29075) 1950 DA using the Drucker-Prager criterion. The minimum in- 
ternal cohesive force as a function of bulk density for three different angles of fric- 
tion is shown in Fig. 4. 
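
The minimum cohesion can be estimated numerically from the average stresses and the Drucker-Prager criterion. The Python sketch below assumes rotation about the shortest (c) axis, a tension-positive sign convention, and the criterion rearranged as k ≥ √J₂ + s(σ_x + σ_y + σ_z); the semi-axes are half the ellipsoid dimensions listed in Extended Data Table 2, and the function names are illustrative.

```python
import math

G = 6.6743e-11  # gravitational constant, m^3 kg^-1 s^-2


def average_stresses(density, omega, semi_axes, shape_coeffs):
    """Average normal stresses (Pa) in a spinning homogeneous ellipsoid."""
    a, b, c = semi_axes
    coeff_x, coeff_y, coeff_z = shape_coeffs
    spin = density * omega ** 2
    grav = 2.0 * math.pi * density ** 2 * G
    sigma_x = (spin - grav * coeff_x) * a ** 2 / 5.0
    sigma_y = (spin - grav * coeff_y) * b ** 2 / 5.0
    sigma_z = (-grav * coeff_z) * c ** 2 / 5.0
    return sigma_x, sigma_y, sigma_z


def slope_constant(friction_angle_deg):
    """Drucker-Prager slope constant s for a given angle of friction."""
    sin_phi = math.sin(math.radians(friction_angle_deg))
    return 2.0 * sin_phi / (math.sqrt(3.0) * (3.0 - sin_phi))


def minimum_cohesion(density, omega, semi_axes, shape_coeffs, friction_angle_deg):
    """Smallest k (Pa) satisfying the criterion; negative means none is needed."""
    sx, sy, sz = average_stresses(density, omega, semi_axes, shape_coeffs)
    sqrt_j2 = math.sqrt(((sx - sy) ** 2 + (sy - sz) ** 2 + (sz - sx) ** 2) / 6.0)
    return sqrt_j2 + slope_constant(friction_angle_deg) * (sx + sy + sz)


omega = 2.0 * math.pi / (2.1216 * 3600.0)   # rad s^-1
semi_axes = (730.0, 695.0, 535.0)           # m, half of 1.46 x 1.39 x 1.07 km
shape_coeffs = (0.57003, 0.60584, 0.82413)  # A_x, A_y, A_z from the text

k_nominal = minimum_cohesion(1700.0, omega, semi_axes, shape_coeffs, 40.0)
k_dense = minimum_cohesion(3500.0, omega, semi_axes, shape_coeffs, 40.0)

print(f"minimum cohesion at 1.7 g/cm^3: {k_nominal:.0f} Pa")
print(f"minimum cohesion at 3.5 g/cm^3: {k_dense:.0f} Pa")
```

For the nominal bulk density the result falls inside the quoted 64 (+12/−20) Pa range, which is a median over the full parameter distribution rather than a single nominal evaluation. At 3.5 g cm⁻³ the value is negative, meaning no cohesion is required.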
Extended Data Figure 1 | Example ATPM fit to the WISE thermal-infrared
observations. This fit (lines for WISE infrared bands W3 and W4) was made
for a thermal inertia of 24 J m⁻² K⁻¹ s⁻¹/² and a surface roughness of 50%.
The error bars correspond to the 1σ uncertainties on the measured data points.
Extended Data Figure 2 | WISE thermal-infrared images of (29075) 1950 DA.
The image scale is 2.75 arcsec per pixel for the W1, W2 and W3 bands and
5.53 arcsec per pixel for the W4 band. White pixels are ‘bad’ pixels that do not
contain data. The object seen to the upper left of (29075) 1950 DA (red circle)
in the W1 (3.4 μm) and W2 (4.6 μm) bands is a faint background star (green circle).
Extended Data Figure 3 | Physical properties derived for (29075) 1950 DA
as a function of thermal inertia. a, Diameter; b, bulk density. The dashed
lines represent the 1σ uncertainty for the average solid lines. The red horizontal
lines represent the radar diameter constraint⁸ of 1.30 ± 0.13 km, and the red
vertical lines represent the corresponding thermal inertia constraint of
≤82 J m⁻² K⁻¹ s⁻¹/².
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Extended Data Figure 4 | Enhancement of Yarkovsky orbital drift by 
surface roughness for (29075) 1950 DA. 
Extended Data Table 1 | WISE thermal-infrared observations of (29075) 1950 DA

MJD* (day)   Rotation phase  WISE band  WISE magnitude  Flux (10⁻¹⁷ W m⁻² μm⁻¹)  Heliocentric distance (AU)  WISE-centric distance (AU)  Phase angle (°)
55389.71982  0.000           W3         10.00 ± 0.16    7.63 ± 1.20               1.738                       1.410                       35.8
55389.85212  0.497           W3         9.64 ± 0.12     10.64 ± 1.29              1.739                       1.409                       35.8
55390.11670  0.490           W3         9.52 ± 0.12     11.89 ± 1.43              1.741                       1.408                       35.7
55390.18282  0.238           W3         9.93 ± 0.15     SA7 S119                  1.741                       1.407                       35.7
55390.24890  0.985           W3         10.10 ± 0.19    6.98 ± 1.26               1.741                       1.407                       35.7
55390.24903  0.987           W3         9.90 ± 0.16     8.86 ± 1.27               1.741                       1.407                       35.7
55390.31510  0.734           W3         9.76 ± 0.13     9.51 ± 1.27               1.742                       1.407                       35.7
55390.38121  0.482           W3         9.86 ± 0.15     8.69 ± 1.25               1.742                       1.406                       35.7
55390.51351  0.978           W3         9.92 ± 0.15     8.23 ± 1.22               1.743                       1.406                       35.7
55390.57973  0.727           W3         9.55 ± 0.11     11.54 ± 1.30              1.744                       1.405                       35.7
55390.71200  0.224           W3         9.82 ± 0.15     8.98 ± 1.28               1.745                       1.405                       35.6
55390.84433  0.721           W3         9.52 ± 0.11     11.87 ± 1.34              1.745                       1.404                       35.6
55390.97650  0.216           W3         9.98 ± 0.17     7.75 ± 1.28               1.746                       1.403                       35.6
55390.97660  0.217           W3         9.79 ± 0.14     9.27 ± 1.25               1.746                       1.403                       35.6
55390.57973  0.727           W4         7.16 ± 0.30     6.85 ± 1.76               1.744                       1.405                       35.7
55390.84433  0.721           W4         7.09 ± 0.29     6.80 ± 1.85               1.745                       1.404                       35.6

* MJD is Modified Julian Day, that is, MJD = JD − 2400000.5.
Extended Data Table 2 | Physical properties of (29075) 1950 DA

Category             Property                                                                       Value
Size                 Diameter of equivalent volume sphere                                           1.30 ± 0.13 km
                     Dimensions of dynamically-equivalent and equal-volume ellipsoid (2a, 2b, 2c)   1.46 × 1.39 × 1.07 km
Optical              Absolute magnitude⁸                                                            16.8 ± 0.2
                     Phase parameter⁸                                                               0.15 ± 0.10
                     Geometric albedo⁸                                                              0.20 ± 0.05
Rotation             Rotation period⁸                                                               2.12160 ± 0.00004 hr
                     Obliquity                                                                      168 ± 5°
Orbit                Semimajor axis                                                                 1.70 AU
                     Eccentricity                                                                   0.51
                     Yarkovsky semimajor axis drift                                                 (−2.95 ± 0.57) × 10⁻⁴ AU/Myr (or −44.1 ± 8.5 m yr⁻¹)
Surface composition  Spectral type
                     Thermal inertia*                                                               24 (+20/−14) J m⁻² K⁻¹ s⁻¹/² (36 (+30/−20) J m⁻² K⁻¹ s⁻¹/² at 1 AU)
                     Surface roughness*                                                             50 ± 30%
                     Radar albedo⁸                                                                  0.23 ± 0.05
                     Radar circular polarization ratio⁸                                             0.14 ± 0.03
Mass                 Bulk density*                                                                  1.7 ± 0.7 g cm⁻³
                     Macro-porosity*                                                                51 ± 19%
                     Mass*                                                                          (2.1 ± 1.1) × 10¹² kg
Cohesion             Surface area of negative ambient gravity*                                      48 ± 24%
                     Peak negative ambient gravity*                                                 (3 ± 1) × 10⁻⁵ g_E
                     Internal cohesive strength*                                                    64 (+12/−20) Pa

* Derived in this work.
Formation of monatomic metallic glasses through ultrafast liquid quenching


Li Zhong, Jiangwei Wang, Hongwei Sheng, Ze Zhang & Scott X. Mao


It has long been conjectured that any metallic liquid can be vitrified 
into a glassy state provided that the cooling rate is sufficiently high’. 
Experimentally, however, vitrification of single-element metallic liquids 
is notoriously difficult’. True laboratory demonstration of the for- 
mation of monatomic metallic glass has been lacking. Here we report 
an experimental approach to the vitrification of monatomic metallic 
liquids by achieving an unprecedentedly high liquid-quenching rate 
of 10¹⁴ K s⁻¹. Under such a high cooling rate, melts of pure refractory
body-centred cubic (bcc) metals, such as liquid tantalum and vana- 
dium, are successfully vitrified to form metallic glasses suitable for 
property interrogations. Combining in situ transmission electron 
microscopy observation and atoms-to-continuum modelling, we inves- 
tigated the formation condition and thermal stability of the mon- 
atomic metallic glasses as obtained. The availability of monatomic 
metallic glasses, being the simplest glass formers, offers unique pos- 
sibilities for studying the structure and property relationships of 
glasses. Our technique also shows great control over the reversible 
vitrification-crystallization processes, suggesting its potential in 
micro-electromechanical applications. The ultrahigh cooling rate, 
approaching the highest liquid-quenching rate attainable in the 
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experiment, makes it possible to explore the fast kinetics and struc- 
tural behaviour of supercooled metallic liquids within the nanosec- 
ond to picosecond regimes. 

Since the first discovery of metallic glass (MG) in the 1960s (ref. 1), 
the search for new types of MG has not stopped”*”. To date, most MG 
formers are known to consist of two or more elements with distinct atomic 
sizes and chemical affinities”*, usually formed by quenching the liquids 
with techniques varying from conventional die casting® (10'-10° Ks‘), 
melt spinning? (10°-10° Ks _'), liquid splat-quenching? (~10°-10'° Ks ') 
to pulsed laser quenching!° (~10'7-10'* Ks_'). New techniques such 
as nanocalorimetry"! (10*-10° K s_') have also emerged, vying for high 
heating and cooling rates. Unfortunately, these solidification techniques 
can hardly be applied to the production of monatomic MGs, mainly 
because of the extremely low glass-forming ability of monatomic metal- 
lic liquids, resulting from their fast nucleation and crystal growth kinet- 
ics at deep undercoolings*’*"*. Thus, vitrification of pure monatomic 
MGs requires extremely high critical cooling rates, far above the exper- 
imentally accessible level, to suppress crystal growth. The monatomic 
MG may also be confronted with the thermal stability issue at room 
temperature, at which spontaneous crystallization seems inevitable’*. 


Figure 1 | Illustration of an ultrafast liquid- 
quenching approach. a-c, Schematic drawing of 
the experimental configuration. Two protruded 
nano-tips are brought into contact with each other 
(a) and are melted by the application of a short 
square electric pulse with a duration of ~3.7 ns 
and a voltage in the range 0.5-3 V (b). Heat 
dissipates rapidly through the two bulk substrates 
(indicated by two red arrows), vitrifying the 
melting zone to form monatomic MGs 

(c). d, e, High-resolution TEM images showing two 
contacting Ta nano-tips (d) forming a Ta MG 

(e) after the application of a 0.8-V, 3.6-ns electric 
pulse. The GCIs are denoted by yellow dotted 
curves. f-h, Fast Fourier transformations 
confirming a fully vitrified region 20 nm long and 
15 nm thick (g) bounded by two crystalline 
substrates viewed along the (100) (f) and (110) 
(h) crystallographic orientations, respectively. 
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Figure 2 | Structure and thermal stability of Ta MG. a, TEM morphology 
of a typical Ta MG with length of ~90 nm and diameter of ~60 nm. The GCIs 
are indicated by yellow dotted curves. b, Electron diffraction of Ta MG, as 
quenched (left) and after relaxation for ~8 h (right). c, Comparison of the 
structure factors of the Ta MGs as formed (pink line), after relaxation for 8h 
(green line) and simulated (orange circles). All three curves show very similar 
peak positions, including well-separated second (q2) and third (q3) peaks 
(indicated by cyan arrows). The ratios of peak positions are the same for the 
relaxed and simulated structures: q2/q, = 1.68 and q3/q, = 1.99. 


Consequently, except for a few special circumstances (for example at 
very thin edges of a splat-quenched nickel foil’), monatomic MGs have 
not been found to form from pure metal melts by vitrification. 

More recently, pure metallic germanium liquid was reported to vitrify 
under hydrostatic pressure above 7.9 GPa (ref. 5). However, on releasing 
pressure to ambient condition, germanium MG quickly transforms to a 
non-metallic low-density amorphous phase, in which case the tendency 
and mechanism of liquid vitrification are largely different from those of 
most d-block transition metals. Other non-vitrification methods (such 
as vapour deposition’® and chemical synthesis’*) have been attempted 
to produce monatomic amorphous samples, which are often in geomet- 
rically confined forms (such as substrate-supported thin films and nano- 
sized powders) and are plagued with either purity’® or stability’ problems, 
offering limited potential for broader applications. Therefore advanced 
techniques to fabricate high-purity monatomic MGs with controllable 
geometries are highly appealing. By building an in situ Joule heating nano- 
device inside a transmission electron microscope (TEM), we present 
a unique ultrafast liquid-quenching system for vitrifying monatomic 
metallic liquids. This technique exploits the excellent thermal conduc- 
tivity of the metals and maximizes the heat conduction rate of the cool- 
ing system. 

Our ultrafast quenching technique is illustrated in Fig. 1a-c (see 
Methods). First, two protruded nano-tips with clean surfaces (Extended 
Data Fig. 1a, b) are brought into contact with each other (Fig. 1a) under 
an ultrahigh vacuum condition inside the TEM. A short square elec- 
tric pulse, typically 0.5-3 V in amplitude and within 3.7 ns in duration, 
imposes local Joule heating on the joined tips, causing melting of the 
extrusion tips and the formation of a melting zone in the middle (Fig. 1b). 
On instantaneous cessation of the electric pulse and, consequently, local 
Joule heating, heat dissipates rapidly through the solidifying piece and 
the conductive heat reservoir, creating an extremely high cooling rate suf- 
ficient to vitrify the melt into a glassy state (Fig. 1c). In Fig. 1d, e we de- 
monstrate that a 0.8-V, 3.6-ns electric pulse on two connecting crystalline 
Figure 3 | Dynamic vitrification process in liquid Ta revealed by AtC 
computer simulation. a, Atomic configuration showing a liquid zone of Ta 
35 nm in length after Joule heating (at t = 0 ps). The atoms are coloured on the 
basis of their degree of disorder represented by local bond-orientational order 
parameter q6 (ref. 29). The red colour corresponds to liquid Ta after Joule 
heating. b, Atomic configuration showing the formation of a Ta MG segment 
30 nm in length after quenching (t = 150 ps). The average temperature of the 
Ta nanowire is close to room temperature at t = 150 ps. The inset highlights the 
interface structure between amorphous and bcc Ta. c, A time-temperature- 
transformation diagram derived from isothermal MD simulations, outlining 
approximately the formation condition of Ta MG. The crystal zone is estimated 
from the crystal growth rates of the (100) plane (cyan circles) and the (110) 
plane (orange squares). The red solid line indicates the temperature evolution 
of the moving LCI (and later on GCI) during cooling. 


Ta nano-tips (Fig. 1d) led to the formation of a Ta MG 15 nm wide and 
20 nm long (Fig. 1e). Structural characterization of the MG is presented 
below. The dimensions of the MGs can be controlled by tuning electric 
pulse parameters while engaging in situ tensile or compressive loading. 
In this way, Ta MGs with dimensions of 100 nm in diameter or an aspect 
ratio of ~4 are obtainable (Extended Data Fig. 1c, d). The formation of 
even larger Ta MGs, which are not electron transparent, was not pur- 
sued in this work. Applying this method, we have systematically tested 
the vitrification capability of transition metals and successfully obtained 
Ta, V, W and Mo monatomic MGs; their morphologies are provided in 
Extended Data Figs 2-4. The materials systems that have been vitrified 
are typically early transition bcc metals with high melting points and 
excellent thermal conductivities. 
Figure 4 | Reversible crystallization-vitrification phase changes of Ta MG. 
a, Formation of a Ta MG 40 nm thick and 50 nm long under a 3.6-ns, 1.26-V 
electric pulse. The two GCIs are indicated by yellow dotted curves and are 
labelled A and B, respectively. b, c, Controlled gradual crystallization under a 
series of pulses 3.6 ns in duration and 0.90 V in amplitude. Crystallization 
proceeded with crystal growth at GCI B (indicated by a yellow arrow) and 


The amorphous nature of the MG as obtained was confirmed by TEM 
diffraction patterns. As a first check, the diffuse diffraction halos in the 
fast Fourier transformation (Fig. 1g) of the area bounded by two glass-crystal 
interfaces (GCIs) are characteristic of amorphous structure, contrasting 
with the bright diffraction spots of the Ta substrates with a well-defined 
bcc structure (Fig. 1f, h). To confirm the glassy structure and the thermal 
stability of Ta MG, a sample 60 nm in diameter and 90 nm in length 
was relaxed in high vacuum at room temperature for 8 h (Fig. 2a). Elec- 
tron diffraction patterns of the Ta MGs as quenched (Fig. 2b, left) and 
after relaxation (Fig. 2b, right) showed similar features characterized by 
diffuse halos typical of amorphous structure. The corresponding integrated 
and optimized one-dimensional static structure factors’ S(q) (Fig. 2c) 
showed similarities in their shape and peak positions, indicating that 
no major structural changes occurred in Ta MG after 8 h. The slight 
shift to the right in the main peak positions of S(q) may be attributed 
to structural relaxation in the glass, as expected. The main peak positions 
of the relaxed Ta MG were measured to be 2.63 Å⁻¹, 4.42 Å⁻¹ and 
5.23 Å⁻¹, corresponding to q2/q1 = 1.68 and q3/q1 = 1.99, which are 
almost identical to the simulated structure factor (orange circles in Fig. 2c) 
derived by quenching liquid Ta at a cooling rate of ~10¹³ K s⁻¹ on the 
computer. The observed S(q) of Ta MG also agrees well with theoretical 
works on monatomic systems’””, as well as with previous experimental 
results on amorphous cobalt” and iron”, for which q2/q1 = 1.69 and 
q3/q1 = 1.97. 
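
The quoted peak ratios follow directly from the three measured peak positions. A minimal check, using only the values given in the text (the variable names are illustrative):

```python
# Peak positions of S(q) for the relaxed Ta MG, in inverse angstroms,
# as quoted in the text.
q1, q2, q3 = 2.63, 4.42, 5.23

ratio_21 = q2 / q1  # second peak relative to the main peak
ratio_31 = q3 / q1  # third peak relative to the main peak

print(f"q2/q1 = {ratio_21:.2f}, q3/q1 = {ratio_31:.2f}")
# q2/q1 = 1.68, q3/q1 = 1.99
```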

To understand the vitrification process of the liquid and estimate the 
cooling rate, atoms-to-continuum (AtC) simulations” have been per- 
formed, where the molecular dynamics (MD) system is coupled to an 
additional electron temperature field that implements the two-temperature 
model (TTM)” for heat transport (see Methods). The present experi- 
mental approach is capable of vitrifying monatomic metallic liquids on 
temporal and spatial scales commensurate with those in MD modelling, 
permitting a direct comparison between experiment and MD simula- 
tion and enabling accurate interpretation of the multiphysics of the cool- 
ing process. Quenching of liquid Ta starts at the moment when external 


completed after six crystallization pulses (inset in c). d, A second vitrification 
pulse resulted in the formation of a Ta MG similar to that shown in a. 

e, f, Close-up views of the atomically rough and diffuse GCIs during a 
phase-change cycle. A schematic drawing with cyan dotted lines along one set 
of the (110) planes shows the gradual breakdown of the long-range order across 
the GCI. 


Joule heating is turned off (Extended Data Fig. 5a), during which the 
temperature evolution in the nanowire depends on rapid heat dissipa- 
tion through the massive crystalline substrates kept at room temperature. 
As a result of the large temperature gradient, excellent heat conductivity 
and small specimen size, ultrafast cooling is achieved, as demonstrated 
by the evolution of the temperature distribution in the Ta nanowire 
(Extended Data Fig. 5b). The computed cooling rate of the liquid zone 
(Extended Data Fig. 5c) reaches as high as 10¹⁴ K s⁻¹ at 4,200 K and 
decreases slightly to 5 × 10¹³ K s⁻¹ at the glass transition temperature 
Tg of liquid Ta, which is estimated to be ~1,650 K (Extended Data Fig. 6). 
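
As a rough consistency check on these numbers, the sketch below estimates how long the melt would take to cool from 4,200 K to the glass transition near 1,650 K if the cooling rate were held constant at either end of the quoted range. The rates are taken here as 10¹⁴ and 5 × 10¹³ K s⁻¹; the constant-rate simplification and the function name are assumptions made for illustration and are not part of the AtC simulation.

```python
def quench_time_ps(t_start_k, t_end_k, rate_k_per_s):
    """Time in picoseconds to cool from t_start_k to t_end_k at a constant rate."""
    return (t_start_k - t_end_k) / rate_k_per_s * 1e12

T_LIQUID_K = 4200.0  # temperature at which the peak cooling rate is quoted
T_GLASS_K = 1650.0   # estimated glass transition temperature of liquid Ta

# Bounds on the quench time, assuming the rate stays fixed at each extreme.
fastest_ps = quench_time_ps(T_LIQUID_K, T_GLASS_K, 1e14)
slowest_ps = quench_time_ps(T_LIQUID_K, T_GLASS_K, 5e13)

print(f"time to reach Tg: {fastest_ps:.1f} to {slowest_ps:.1f} ps")
```

Under these assumptions the liquid crosses the glass transition within a few tens of picoseconds, which is in line with the statement that the atoms are frozen within a fraction of a nanosecond.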

Accompanying the rapid quenching process, the real-time dynamics 
of the atomic system was revealed by MD simulations, indicating that 
whether a MG can eventually form is determined by the competition 
between the liquid-quenching rate and the crystal growth rate from the 
melt. Under the given experimental condition, a large portion of the 
original liquid Ta zone 35 nm in length was vitrified after Joule heating 
was cut off (Fig. 3a, b), demarcated by atomically rough GCIs (Fig. 3b 
inset), corroborating our experimental observations. More generally, the 
time needed for complete crystallization due to crystal growth at the (100) 
(cyan circles) and the (110) (orange squares) interfaces of the pre-existing 
crystals (that is, the crystalline substrates) at different temperatures is 
computed and plotted in the time-temperature-transformation dia- 
gram for Ta (Fig. 3c), based on the growth rates of these two liquid- 
crystal interfaces (LCIs) from the melt (Extended Data Fig. 5d). In the 
AtC simulation shown in Fig. 3a, b, the temperature evolution of the 
LCI (and later on the GCI) of the system follows the red curve in Fig. 3c, 
which trends into the glass-forming region (that is, the left side of the 
time-temperature-transformation curves) and corresponds to a quench- 
ing rate one order of magnitude higher than the critical cooling rate of 
~5 × 10¹² K s⁻¹ estimated from dimensional considerations (described 
in Methods), justifying the formation of Ta MG under the present exper- 
imental configuration. The effect of a trailing edge in the applied electric 
pulses is taken into account by modelling liquid quenching under a heat 
flux terminated within 0.4 ns in a ramp function rather than instantly. 
As shown in Extended Data Fig. 7a, b, 18 nm of the 35-nm Ta liquid 
was successfully vitrified into MG under a cooling rate varying between 
3 × 10¹³ and 10¹⁴ K s⁻¹ (Extended Data Fig. 7c). 

The thermal stability of Ta MG is rationalized by our computation, 
showing that the crystal growth of low-index faces of bcc Ta is a thermally 
activated process at room temperature, with infinitesimally small growth 
rates at the GCIs (based on Extended Data Fig. 5d). It should be pointed 
out that the ‘slow’ growth rate of Ta crystals is distinctly different from 
that of face-centred cubic (fcc) metals, in which the growth of crystal inter- 
faces is expected to be spontaneous and fast even at zero temperature”. 
Indeed, we have tried but failed to produce any monatomic MGs from 
fcc metals (for example gold, silver, copper, palladium, aluminium, rho- 
dium and iridium) using the very same approach. 

The competition between vitrification and crystal growth can be con- 
trolled experimentally, leading to a novel phase-change phenomenon in 
the MG. Figure 4 illustrates a vitrification-crystallization cycle in a Ta 
sample controlled by alternately applying two kinds of electric pulse with 
the same duration (3.6 ns) but different voltages. With the assistance of 
in situ TEM observation, the structural and morphological evolutions 
of the sample can be monitored on the fly. As shown in Fig. 4a, a Ta MG 
40 nm in thickness and 50 nm in length obtained with a high-voltage 
(1.26-V) electric pulse (that is, the vitrification pulse) reverted to its original 
crystalline state after the application of a series of low-voltage (0.90-V) 
electric pulses (that is, the crystallization pulses), with each pulse reduc- 
ing the size of the sandwiched amorphous zone (Fig. 4b, c). The GCIs 
were identified as being atomically rough and diffuse during both vit- 
rification (Fig. 4e) and crystallization (Fig. 4f). After complete crystal- 
lization of the Ta MG (Fig. 4c and inset), another vitrification pulse again 
generated a glassy zone of Ta (Fig. 4d) with almost identical dimensions 
and morphology to that shown in Fig. 4a, demonstrating that a reversible 
glass-crystal phase-change process is achievable by our approach. 
Another example of controlled phase changes in Ta MG is presented 
in Supplementary Video 1. This reversible phase-change behaviour of 
Ta, bearing a resemblance to those in chalcogenide glasses**”, indicates 
the potential of the present methodology for employing marginal glass 
formers with extremely fast crystallization kinetics for applications in 
phase-change-based nano-devices”*”’. 

The vitrification of pure metallic liquids reported here should not be 
attributed to the enhanced glass-forming ability associated with impu- 
rities in the original materials (Extended Data Table 1) and/or oxygen 
contamination during experimental procedures (see Methods and Ex- 
tended Data Fig. 8). The successful formation of monatomic MGs opens 
up new opportunities for studying the structural dependence of rheol- 
ogical, thermal, electrical and mechanical properties of MGs, in which 
complications due to chemical effects in multicomponent MGs can be 
shielded. For instance, we have conducted tensile tests on the monatomic 
Ta MG as synthesized (Extended Data Fig. 9 and Supplementary Video 2). 
The insight gained from the mechanical testing experiment is beyond 
the scope of the present paper and will be presented later. Last, we stress 
that the ultrafast quenching rate (~10¹⁴ K s⁻¹) achieved in this technique 
is high enough to freeze the atoms within a fraction of a nanosecond. 
With such a high cooling rate to reach deep quench, the inherent struc- 
ture of liquids”* can be accessed, enabling investigations to be made of 
the fast kinetics of supercooled liquids and the mechanisms for the for- 
mation of metastable materials under conditions far from equilibrium. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
The tidal-rotational shape of the Moon and evidence for polar wander 


Ian Garrick-Bethell, Viranga Perera, Francis Nimmo & Maria T. Zuber 


The origin of the Moon’s large-scale topography is important for 
understanding lunar geology’, lunar orbital evolution’ and the Moon’s 
orientation in the sky’. Previous hypotheses for its origin have included 
late accretion events’, large impacts’, tidal effects® and convection 
processes’. However, testing these hypotheses and quantifying the 
Moon’s topography is complicated by the large basins that have formed 
since the crust crystallized. Here we estimate the large-scale lunar 
topography and gravity spherical harmonics outside these basins and 
show that the bulk of the spherical harmonic degree-2 topography 
is consistent with a crust-building process controlled by early tidal 
heating throughout the Moon. The remainder of the degree-2 topo- 
graphy is consistent with a frozen tidal-rotational bulge that formed 
later, at a semi-major axis of about 32 Earth radii. The probability of 
the degree-2 shape having both tidal-heating and frozen shape characteristics 
by chance is less than 1%. We also infer that internal density 
contrasts eventually reoriented the Moon’s polar axis by 36 ± 4°, 
to the configuration we observe today. Together, these results link 
the geology of the near and far sides, and resolve long-standing ques- 
tions about the Moon’s large-scale shape, gravity and history of polar 
wander. 

The theory of equilibrium figures of rotating fluid bodies is a classic 
problem in geophysics, and it has been helpful in understanding the 
shapes of the Sun and planets. However, the origin of the Moon’s shape 
has remained an open problem in the past century~** ”°, and the body’s 
deviations from any simple tidal-rotational (spherical harmonic degree-2) 
figure are large"’. This difficulty is surprising given the Moon’s presum- 
ably simple early thermal history: born hot and quickly cooled, one might 
expect the Moon to be described by a simple figure of equilibrium. 

Researchers have traditionally suggested that the Moon’s degree-2 
spherical harmonic gravity coefficients, which have been used as proxies 
for the degree-2 shape, are especially large when compared to higher- 
degree coefficients”'’. Figure 1 shows a power law or ‘Kaula’s rule’ fit to 
degrees n = 3 to 50 for the Moon’s gravity’* and topography data’*. The 
power at degree 2 is 4.5 times and 2.6 times the power expected from 
extrapolating the best-fit power law, for gravity and topography, respec- 
tively, supporting the idea that the degree-2 coefficients are unique. 
Indeed, the fraction of excess power for topography is greater than the 
excesses for Venus, Earth and Mars (Supplementary Information). The 
Moon’s strong degree-2 power has been interpreted as a frozen tidal- 
rotational state inherited from when the Moon was closer to the Earth; 
this is known as the fossil bulge hypothesis®. An open problem, however, 
has been that the ratio of the C2,0 and C2,2 spherical harmonic coefficients 
is different from the expected value by a factor of 2.6 (refs 2 and 10). 
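
The excess-power comparison described above can be sketched as a log-log least-squares fit over degrees 3 to 50, extrapolated back to degree 2. The example below is only an illustration of that procedure: the spectrum is synthetic (an exact power law with a hypothetical amplitude and exponent, plus a degree-2 term set to 4.5 times the extrapolated value), and the function names are assumptions, not the authors' code.

```python
import math

def fit_power_law(degrees, power):
    """Least-squares fit of log(power) = log(A) - b*log(n); returns (A, b)."""
    xs = [math.log(n) for n in degrees]
    ys = [math.log(p) for p in power]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def degree2_excess(spectrum):
    """Ratio of observed degree-2 power to the power law fitted to n = 3..50."""
    degrees = list(range(3, 51))
    amplitude, exponent = fit_power_law(degrees, [spectrum[n] for n in degrees])
    predicted_degree2 = amplitude * 2.0 ** (-exponent)
    return spectrum[2] / predicted_degree2

# Synthetic spectrum (hypothetical amplitude 1e3 and exponent 3.5), with the
# degree-2 power deliberately set to 4.5 times the power-law value.
synthetic = {n: 1.0e3 * n ** -3.5 for n in range(3, 51)}
synthetic[2] = 4.5 * 1.0e3 * 2 ** -3.5

excess = degree2_excess(synthetic)
print(f"degree-2 excess factor: {excess:.2f}")
```

With real gravity or topography spectra in place of the synthetic dictionary, the same function would return the kind of excess factor quoted in the text.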

Adding to the fossil bulge idea and motivated by tidal processes in 
Europa’s ice shell’’, Garrick-Bethell et al.’* inferred that the farside high- 
lands crust has a degree-2 shape that is explainable by tidal heating 
during the magma ocean epoch. However, ref. 16 did not address the 
rest of the Moon’s shape, the Moon’s orientation history, and the details 
of gravity and topography when they are examined together. In par- 
ticular, ref. 16 did not reconcile its results with the classic fossil bulge 


hypothesis”®, or explain its anomalous C2,0/C2,2 ratio. To address these 
problems and create a unified explanation for the Moon’s degree-2 shape 
and orientation, we here consider two effects: the Moon’s largest basins, 
and the reference frame in which we analyse lunar topography. 

The South Pole-Aitken basin (SPA) is the largest’’, deepest’ and oldest 
lunar basin’, and its degree-2 power affects our interpretation of the primordial 
degree-2 shape. In addition to SPA, we focus on the 12 largest 
basins that produce obvious local anomalies in topography, crustal thickness 
or gravity (in all, 22% of the surface, Fig. 2a-c). To determine the 
Moon’s degree-2 shape without these basins, we fit spherical harmonics 
of degrees n = 0 to 5 to data outside their boundaries. Figure 2d and f 
show the Moon’s topography and appearance after rotation to the reference 
frame where the only non-zero degree-2 terms are C2,0 and C2,2 
(with C2,0 < 0), hereafter termed ‘the principal frame’. If the Moon’s 
outer figure, as opposed to its internal density distribution, once controlled 
the lunar moments of inertia (see below), this would be the reference 
frame that once faced Earth. This frame’s largest principal axis 
is at (6 ± 4° S, 30 ± 1° E), its polar axis is at (54 ± 5° N, 309 ± 6° E), 
and its intermediate axis is at (35.1 ± 5° S, 296.4 ± 4° E) (Fig. 2a-c). 
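
Because these three directions are principal axes, they should be mutually perpendicular. A minimal check, assuming the quoted central values (south latitudes entered as negative) and ignoring the stated uncertainties; the helper names are illustrative:

```python
import math

def unit_vector(lat_deg, lon_deg):
    """Cartesian unit vector for a latitude and east longitude in degrees."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def angle_between_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

largest = unit_vector(-6.0, 30.0)          # 6 deg S, 30 deg E
polar = unit_vector(54.0, 309.0)           # 54 deg N, 309 deg E
intermediate = unit_vector(-35.1, 296.4)   # 35.1 deg S, 296.4 deg E

angles = {
    "largest-polar": angle_between_deg(largest, polar),
    "largest-intermediate": angle_between_deg(largest, intermediate),
    "polar-intermediate": angle_between_deg(polar, intermediate),
}
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} deg")
```

All three angles come out within about one degree of 90°, well inside the quoted uncertainties.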

Without the largest basins, the Moon’s topography power spectrum 
displays substantially less variance at low degrees. Performing a power- 
law fit for n = 3 to 50 using the new power at degrees 3, 4 and 5 (Fig. 1, 
Figure 1 | Lunar topography and gravity power spectra, with best-fit power 
laws for degrees n = 3 to 50. The blue dots show the power using data outside 
large basins (±1σ). The blue dot at degree-1 for gravity is due to a small 
displacement of the lunar centre of mass when the basins are removed. See 
Supplementary Information section 1. 
[Figure 2 panel labels. Views: nearside, farside. Global topography: C2,0 = -0.67 km, C2,2 = 0.11 km. Global gravity: C2,0 = -9.08 × 10⁻⁵, C2,2 = 3.47 × 10⁻⁵.] 
Figure 2 | The topography, gravity and appearance of the Moon, with black 
lines illustrating basins removed in the analysis. a, Lunar topography. The 
crossed black circle, diamond and square in a, b and c are the primordial 
minimum, intermediate and maximum principal moment of inertia axes, 
respectively. b, Expansion of degree-2 to degree-360 lunar gravity potential 
coefficients (multiply by 2.823 × 10⁶ m² s⁻² to obtain the surface potential). 


dashed red line), we find the degree-3 and degree-4 power is much closer 
to the predictions from Kaula’s rule. However, the degree-2 power remains 
in excess by a factor of 2.8. The Moon’s strong degree-2 power, even with- 
out its large basins, implies that purely local explanations for the degree-2 
character of the far side, such as a late-accreting second moon’, are less 
plausible. 

To address the origin of the Moon’s primordial degree-2 shape, we 
must also consider the degree-2 gravity potential of the Moon (Fig. 2b). If 
we again fit degree-2 coefficients outside the basins, we find that gravity’s 
largest principal axis shifts only 5 ± 2°, from (0° N, 180° E) to (5 ± 2° S, 
182 ± 1° E), and its polar axis only 5 ± 2° to (85 ± 2° N, 203 ± 35° E) 
(Supplementary Table 7). In addition, the degree-2 gravity power decreases 
by a small amount, 12% (Fig. 1, blue dot). The weak effect of basins on 
the degree-2 gravity potential is partly due to SPA’s nearly compensated 
state'’, and SPA’s large contribution (45%) to the area removed. 

The gravity and topography principal frame calculations above reveal 
a previously unappreciated but critical problem in understanding the 
lunar shape: while both gravity and topography have anomalously high 
degree-2 power, the principal topography and gravity reference frames 
do not align at present (that is, using global data), and nor do they align 
when using degree-2 harmonics fitted outside the largest basins. Using 
global data, the largest gravity and topography principal axes are sepa- 
rated by 34°, and using data outside large basins, the largest principal axes 
c, Lunar 750-nm spectral reflectance, with the data above 75° latitude blacked 
out. d, The data in a after rotation to the topography principal frame, using 
rotation angles calculated from data outside large basins. e, The data in b after 
rotating to the topography principal frame, as in d. f, The data in c after rotation 
to the principal topography frame, as in d. 


are separated by 30° ± 5°. Therefore, other non-basin events distorted 
the Moon from any single, simple equilibrium figure in either gravity 
or topography, making it unclear which data set represents the primor- 
dial frame where any tidal-rotational effects were acquired. 
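
The basin-removed separation can be reproduced from the axis positions quoted earlier: the largest topography principal axis at (6° S, 30° E) and the largest gravity principal axis at (5° S, 182° E). Since a principal axis has no preferred sense, the sketch below folds the angle into the range 0° to 90°. Only central values are used, and the helper names are illustrative.

```python
import math

def unit_vector(lat_deg, lon_deg):
    """Cartesian unit vector for a latitude and east longitude in degrees."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def axis_separation_deg(u, v):
    """Smallest angle between two undirected axes, in degrees (0 to 90)."""
    dot = abs(sum(a * b for a, b in zip(u, v)))
    return math.degrees(math.acos(min(1.0, dot)))

topography_axis = unit_vector(-6.0, 30.0)   # basin-removed topography frame
gravity_axis = unit_vector(-5.0, 182.0)     # basin-removed gravity frame

separation = axis_separation_deg(topography_axis, gravity_axis)
print(f"separation of largest principal axes: {separation:.0f} deg")
# separation of largest principal axes: 30 deg
```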

However, a simple argument suggests that topography’s principal frame 
formed first. Degree-2, tidally produced crustal thickness variations’®, 
if they exist, must have developed early when the lithosphere was weak 
enough to permit significant tidal flexing, and will therefore be isosta- 
tically compensated (with a relatively small gravity signature). Further- 
more, any uncompensated fossil component of shape, if it exists, must 
have frozen-in after the lithosphere cooled and strengthened, and degree-2 
crustal thickness growth largely ceased. Therefore, as long as the crustal 
thickness variations produced degree-2 topography that dominated any 
subsequent fossil topography, and the principal axes remained mostly 
fixed while forming, topography’s principal frame will be the Moon’s 
first-established Earth-oriented principal frame. Below, we will demon- 
strate that topography components from both crustal-thickening (com- 
pensated) and fossil-bulge (uncompensated) processes probably exist in 
topography’s principal frame, with the crustal component being larger, 
and that each topography component has the C2,0/C2,2 ratio expected 
from each unique process. 

To assess the nature of the degree-2 topography in the primordial, 
basin-removed principal topography frame, we examine the associated 
Table 1 | Compensated and uncompensated degree-2 topography harmonics 

             Compensated topography (±1σ)   Uncompensated topography (±1σ) 
C2,0         -0.53 ± 0.07 km                -0.11 ± 0.04 km 
C2,2         0.40 ± 0.06 km                 0.11 ± 0.03 km 
C2,0/C2,2    -1.3 ± 0.2                     -1.0 ± 0.3 

Solution for the combination of compensated and uncompensated topography to match the C2,0 and 
C2,2 gravity and topography harmonics shown in Fig. 2d and e. Compensated and uncompensated 
topography are associated with crustal thickness variations and a frozen fossil bulge, respectively. The 
solution assumes a crustal density of 2,550 kg m⁻³, a mantle density of 3,200 kg m⁻³, a mean lunar density of 
3,340 kg m⁻³ and a mean crustal thickness of 40 km (ref. 20). C2,0 values do not sum exactly to 
-0.65 km because of rounding. 


gravity harmonics in the same frame (Fig. 2e). In this frame, we use a 
joint analysis of gravity and topography to find that neither completely 
compensated nor completely uncompensated topography alone can 
explain the C2,0 and C2,2 gravity coefficients (Supplementary Information 
section 4). However, in Table 1 we show that a linear combination of 
compensated and uncompensated topography is consistent with gravity 
and topography observations; the topography is effectively about 80% 
compensated (shown graphically in Supplementary Fig. 9). 
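
The Table 1 entries can be cross-checked with simple arithmetic: the two coefficient ratios, and the share of the degree-2 topography carried by the compensated component. The sketch below uses only the central values from the table; defining the compensated fraction as a simple amplitude share per coefficient is an assumption made for illustration and may differ from the authors' definition.

```python
# Central values from Table 1, in km.
compensated = {"C20": -0.53, "C22": 0.40}
uncompensated = {"C20": -0.11, "C22": 0.11}

ratio_compensated = compensated["C20"] / compensated["C22"]
ratio_uncompensated = uncompensated["C20"] / uncompensated["C22"]

def compensated_fraction(key):
    """Share of the total amplitude of one coefficient that is compensated."""
    total = compensated[key] + uncompensated[key]
    return compensated[key] / total

fraction_c20 = compensated_fraction("C20")
fraction_c22 = compensated_fraction("C22")

print(f"C2,0/C2,2 compensated:   {ratio_compensated:.2f}")
print(f"C2,0/C2,2 uncompensated: {ratio_uncompensated:.2f}")
print(f"compensated share: {fraction_c20:.0%} (C2,0), {fraction_c22:.0%} (C2,2)")
```

Both shares fall close to 80%, consistent with the statement that the topography is effectively about 80% compensated.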

Having established that both fossil (uncompensated) and crustal thickness 
(compensated) topography components are required, we can examine 
their coefficient ratios to test their origins. The ratio of C2,0/C2,2 for 
normalized gravity and topography coefficients is -0.96 (which is approximately 
-1.0) for frozen tidal-rotational fossil bulges in low-eccentricity 
synchronous orbits (and assuming that the normalized polar moment 
of inertia is 0.4)'®'*!. The classic problem has been that the observed 
present-frame ratio is very different from -1.0: it is -2.6 for global 
gravity” (Fig. 2b) and -6.1 for global topography (Fig. 2a). However, 
we must now also consider the expected topography ratio for tidally controlled 
crustal thickness variations’®. Unlike the case for fossil topography, 
this ratio is variable, depending on the amount of tidal dissipation. Although 
dissipation depends on a number of parameters that are difficult 
to estimate, such as lower crustal viscosity, we find that for 114 
model calculations spanning a variety of conditions, C2,0/C2,2 approaches 
-1.1 to -1.3 as the mean global tidal heat flux increases above about 
50 mW m⁻² (Fig. 3). 
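
The value of -0.96 can be recovered from a textbook relation, used here as an assumption rather than as the authors' derivation: for a hydrostatic, synchronously rotating body on a circular orbit, the unnormalized coefficients satisfy C2,0/C2,2 = -10/3. Converting to fully normalized (4π convention) coefficients rescales the ratio. The sketch also recomputes the observed global ratios from the coefficient values given in the Figure 2 labels; it does not use the moment-of-inertia assumption mentioned in the text.

```python
import math

def normalization(n, m):
    """Factor N(n, m) with C_normalized = C_unnormalized / N (4-pi convention)."""
    delta = 1 if m == 0 else 0
    return math.sqrt((2 - delta) * (2 * n + 1)
                     * math.factorial(n - m) / math.factorial(n + m))

# Assumed hydrostatic relation for unnormalized coefficients.
unnormalized_ratio = -10.0 / 3.0

# Dividing each coefficient by its own factor multiplies the ratio by N22/N20.
fossil_ratio = unnormalized_ratio * normalization(2, 2) / normalization(2, 0)

# Observed global ratios, from the coefficient values in the Figure 2 labels.
gravity_ratio = -9.08e-5 / 3.47e-5
topography_ratio = -0.67 / 0.11

print(f"fossil bulge (normalized): {fossil_ratio:.2f}")
print(f"global gravity:            {gravity_ratio:.1f}")
print(f"global topography:         {topography_ratio:.1f}")
```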

From Table 1, we see the ratio C2,0/C2,2 for compensated topography 
in topography’s principal frame is -1.3 ± 0.2, and for uncompensated 
topography, the ratio is -1.0 ± 0.3. These values are consistent with the 
Figure 3 | The ratio C2,0/C2,2 for crustal thickness (or compensated 
topography), as a function of global mean tidal heat flux, for 114 model 
cases. (See Supplementary Table 13.) The observed ratio of -1.3 ± 0.2 
(1σ, dashed lines) for compensated topography outside of large basins is 
illustrated (Table 1). The inset shows a model crustal thickness map with 
C2,0/C2,2 = -1.26 (Supplementary Information section 8). 
ratios predicted for a crust sculpted by tidal heating, and a frozen fossil 
bulge, respectively. A similar spherical harmonic coefficient fit to a model 
of crustal thickness”, with large basins removed, yields C2,0/C2,2 = -1.1 
± 0.2 (Supplementary Fig. 10), in good agreement with the compensated 
topography ratio. The observed topography C2,0/C2,2 ratios are 
robust (compared to their uncertainties) to the inclusion or exclusion 
of different basins, as well as increases in the size of SPA up to 50% (30% 
for other basins), and changes in the maximum fit degree (Supplementary 
Table 4). If we had not removed the effects of large basins, the solution for 
compensated and uncompensated topography in the global-topography 
principal frame yields C2,0/C2,2 values of -2.0 and -6.1, respectively 
(Supplementary Table 10). 

To assess the likelihood that the unique C2,0/C2,2 ratios arise by chance, 
we performed Monte Carlo simulations with topography and gravity 
with the same statistical properties as the observed data. This topography 
and gravity could arise from any source, including early mantle convec- 
tion processes, or the process that produced the Moon’s centre-of-mass/ 
centre-of-figure offset. We find that the probabilities of the compensated 
and uncompensated topography C2,0/C2,2 ratios randomly falling 
between -1.1 and -1.3, and between -0.9 and -1.1 (ranges taken to represent 
the predicted values for each mechanism), are 8% and 5%, respectively 
(Supplementary Information section 9 and Supplementary Figs 12 
and 13). The joint probability is only 0.3%, suggesting that the degree-2 
shape is tidally produced. 

In the principal topography frame, we also obtain gravity terms S2,1, 
C2,1 and S2,2, which constitute 59% of the basin-removed degree-2 gravity 
power (Fig. 2e). Since these terms are associated with zero topography, 
they arise from subsurface density anomalies that must have developed 
after a rigid lithosphere formed. Dynamically produced hemisphere- 
scale density changes have been proposed”’™’, and these would prob- 
ably have degree-2 power that could have affected the Moon’s degree-2 
tidal signatures. We can estimate the probability that the Moon’s tidal 
characteristics would survive such changes. For example, starting with 
just the C2,0 and C2,2 gravity and topography values for the basin-removed 
Moon, a randomly placed hemisphere-sized gravity anomaly that yields 
the same total degree-2 gravity power as the basin-removed Moon permits 
survival (<30% alteration) of the compensated topography C2,0/C2,2 
ratio 92% of the time, and survival of the uncompensated ratio 37% of 
the time (Supplementary Information section 10). This simple model 
demonstrates that if the Moon’s unique tidal signatures form (which is 
seldom by chance; see above), their recovery is quite plausible despite 
subsequent internal gravity changes. This is largely because the C2,0/C2,2 
ratios are dependent on topography, not gravity alone. 

Our tidal calculations indicate that when the semi-major axis a exceeds 
about 25 Earth radii (RE), no realistic models can produce significant 
tidal heating, and when a is less than about 10RE, the orbital evolution 
timescales (less than a million years) are too short to have built a significant 
amount of crust. The uncompensated C2,0 and C2,2 values imply 
fossil freeze-in at a ~ 32RE, or a ~ 30RE allowing for 18% relaxation 
after four billion years™. This freeze-in location is larger than 25RE (above), 
and therefore consistent with the requirement that the lithosphere must 
have formed after the crust-building epoch. The location is also consistent 
with freeze-in before the Cassini state transition (a ~ 30-34RE)”, 
which would have affected the lunar shape. Nominally, it takes roughly 
200-300 million years for the Moon to evolve to a ~ 32RE after accretion”. 
This lithosphere development timescale is consistent with estimates of 
100-200 million years for complete magma ocean crystallization, based 
on radioisotope studies and thermal modelling” ”’. By combining timescales 
such as these, and our inferred fossil formation position at a ~ 32RE, 
the orbital evolution and tidal properties of the early Earth-Moon system 
can be further constrained. 

Finally, we find the principal topography frame places the Moon’s palaeopole 
in northern Oceanus Procellarum (54 ± 5° N, 309 ± 6° E), and 
about 30° from the centre of the thorium-rich Procellarum KREEP (enriched 
in potassium, rare-earth elements and phosphorus) terrane (Supplementary 
Fig. 14). This palaeopole location may be testable by using 
the poles of magnetized portions of the crust’. Eventually, the additional 
gravity in C2,1, S2,1 and S2,2, plus the basins we have removed, 
changed the lunar moments of inertia, and reoriented the Moon to the 
present frame we see today. While the details and timing of these later 
processes are not yet fully understood, a self-consistent origin of the 
primordial degree-2 shape helps to provide a framework for under- 
standing the many subsequent events in lunar evolution. 
A comprehensive account of the causes of alcohol misuse must accommodate 
individual differences in biology, psychology and environment, 
and must disentangle cause and effect. Animal models’ 
can demonstrate the effects of neurotoxic substances; however, they 
provide limited insight into the psycho-social and higher cognitive 
factors involved in the initiation of substance use and progression 
to misuse. One can search for pre-existing risk factors by testing for 
endophenotypic biomarkers’ in non-using relatives; however, these 
relatives may have personality or neural resilience factors that pro- 
tect them from developing dependence’. A longitudinal study has 
potential to identify predictors of adolescent substance misuse, par- 
ticularly if it can incorporate a wide range of potential causal factors, 
both proximal and distal, and their influence on numerous social, 
psychological and biological mechanisms*. Here we apply machine 
learning to a wide range of data from a large sample of adolescents 
(n = 692) to generate models of current and future adolescent alco- 
hol misuse that incorporate brain structure and function, individual 
personality and cognitive differences, environmental factors (includ- 
ing gestational cigarette and alcohol exposure), life experiences, and 
candidate genes. These models were accurate and generalized to novel 
data, and point to life experiences, neurobiological differences and 
personality as important antecedents of binge drinking. By identi- 
fying the vulnerability factors underlying individual differences in 
alcohol misuse, these models shed light on the aetiology of alcohol 
misuse and suggest targets for prevention. 

Alcohol misuse is common among adolescents: slightly over 40% of
all 13-14-year-old adolescents in the USA report alcohol use and 10%
of this age group exhibit regular use. By age 16 years, these figures rise
to almost 65% for any alcohol use and 27% for regular use. This
is of concern as murine models demonstrate that adolescents are more
vulnerable to alcohol-induced neurotoxicity than adults. Early alcohol
use is a strong risk factor for adult alcohol dependence and therefore
identifying inter-individual vulnerabilities and predictors of alcohol use
in human adolescents is of importance. Generating such predictors, however,
is challenging, not least because large sample sizes are needed to
provide accurate estimates of the small effect sizes that prevail in the
biological sciences. Therefore, previous prospective studies, which
typically focus on just one type of risk factor, have necessarily yielded
modest predictions of future alcohol misuse. Moreover, previous classification
approaches incorporating biological data have often been
flawed due to overfitting.

Personality measures, particularly those assessing traits conferring
risk for substance misuse, can identify adolescents at high risk of substance
misuse. Life events in early adolescence, such as parental divorce,
can also serve as predictors of future alcohol use. A number of candidate
genes for alcohol dependence have been identified, although the
overall risk conveyed by any one polymorphism is small. Cognitive
factors such as executive function (for example, inhibitory control), but
not attention and visual memory, distinguished non-substance-using
siblings of substance misusers from healthy controls. Response inhibition
was a modest predictor of adolescent alcohol misuse (explaining
about 1% of variance) in a large sample of adolescents. Until now, there
have been no large-sample prospective studies examining the neural
correlates of alcohol misuse, but there is some evidence of a reduction
in brain activity during tests of inhibitory control for adolescents who
subsequently engaged in heavy alcohol use.

Here, we construct models of current and future adolescent binge
drinking by combining a wide range of data (Extended Data Table 1)
from the IMAGEN project, a multi-dimensional longitudinal study
of adolescent development, using regularized logistic regression (Extended
Data Fig. 1). First (Analysis 1), we identified the characteristics
discriminating 115 14-year-old binge drinkers (a minimum of three 
lifetime binge drinking episodes leading to drunkenness by age 14) from 
150 14-year-old controls (non-binge drinkers, a maximum of two life- 
time uses of alcohol until at least the age of 16; see Extended Data Table 2 
for participant details) returning an area-under-the-curve (AUC) receiver- 
operator characteristic (ROC) value of 0.96 (95% CI = 0.93-0.98; see 
Extended Data Table 3a for all beta weights). At the optimum point in 
the ROC curve, 91% of binge drinkers and 91% of non-binge drinkers 
were correctly classified, significantly better than chance (P = 8.0 X 10°’). 
At the maximum F-score value, this classification accuracy corresponds
to a precision rate of 87% (that is, the proportion of those identified as
binge drinkers who are actually binge drinkers) and a recall rate of 99%
(that is, the proportion of binge drinkers that are successfully detected;
Extended Data Fig. 2a, b).
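
The ROC and F-score summaries reported above can be illustrated with a minimal sketch. The toy labels and scores are invented for illustration and are not IMAGEN data; the helper names (`roc_auc`, `confusion`, `best_f_score`) are hypothetical, and scanning every observed score as a candidate threshold is an assumption rather than the authors' documented procedure.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(labels: List[int], scores: List[float], threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp, fp, fn, tn


def best_f_score(labels: List[int], scores: List[float]) -> Tuple[float, float, float]:
    """Scan each observed score as a threshold; return (F1, precision, recall) at the maximum F1."""
    best = (0.0, 0.0, 0.0)
    for threshold in sorted(set(scores)):
        tp, fp, fn, _ = confusion(labels, scores, threshold)
        if tp == 0:
            continue  # precision and recall are both zero or undefined here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[0]:
            best = (f1, precision, recall)
    return best


# Toy example: 1 = binge drinker, 0 = control (illustrative values only).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
f1, precision, recall = best_f_score(labels, scores)
```

Precision answers "of those flagged, how many are true cases", whereas recall answers "of the true cases, how many were flagged", so the two can diverge sharply at the F-score optimum, as in the figures quoted above.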

The model reported in Analysis 1, although highly accurate, was 
dominated by the inclusion of smoking, which often co-occurs with 
alcohol use. In Analysis 2, therefore, we removed smoking and re-ran 
the analyses (see Extended Data for all additional analyses with smok- 
ing included), which resulted in an AUC of 0.90 (95% CI = 0.86-0.93). 
At the optimum point in the ROC curve, 82% of binge drinkers and 
89% of non-binge drinkers were correctly classified (P = 8.8 X 10 **). 
At the maximum F-score value the precision rate was 87% and the re- 
call rate was 89% (Extended Data Fig. 2e, f). The features included in 
this model, and their strength of association with group membership, 
are displayed in Fig. 1a.

Figure 2a displays the brain regions that most consistently discriminated
current binge drinkers from non-binge-drinkers (see Extended
Data Fig. 3 for the contributions of each brain feature). The most robust
brain classifiers were in ventromedial prefrontal cortex (vmPFC) and
the left inferior frontal gyrus (IFG). The vmPFC grey matter volume
was smaller in the current binge drinkers and this group, compared to
controls, also showed decreased activity when anticipating or receiving
a reward, but increased activity when processing angry faces. In the
left IFG, current binge drinkers had smaller volumes and reduced
activity when anticipating and receiving rewards and when processing
angry faces.

Figure 1 | The relationship between group membership and each feature
that was present in at least 9 folds of the final model. Position on the
horizontal axis represents the point-biserial correlation statistic (r) between each
feature and group membership. Negative r values indicate that higher scores are
associated with an increased likelihood to engage in binge drinking at 14. Error
bars represent 95% confidence intervals (calculated using 10,000 bootstraps).
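
The point-biserial correlation and bootstrap confidence intervals named in the Figure 1 caption can be sketched as follows. This is an illustration under stated assumptions: the data are invented, the group coding (0 or 1) and the percentile form of the bootstrap interval are assumptions, and the helper names are hypothetical. The caption does not specify which bootstrap variant the authors used.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(group, feature):
    """Point-biserial r: Pearson r between a binary group code and a continuous feature."""
    return pearson_r([float(g) for g in group], feature)


def bootstrap_ci(group, feature, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(group)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        g = [group[i] for i in idx]
        f = [feature[i] for i in idx]
        if len(set(g)) < 2 or len(set(f)) < 2:
            continue  # correlation is undefined for a constant resample
        stats.append(point_biserial(g, f))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example (illustrative values only): group code and one feature score.
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
feature = [1.0, 2.0, 2.5, 3.0, 3.5, 3.0, 4.0, 4.5, 5.0, 6.0]
r = point_biserial(group, feature)
ci_low, ci_high = bootstrap_ci(group, feature)
```

The sign of r depends entirely on which group is coded 1, so the direction stated in the caption should be read against the authors' own coding, which is not reproduced here.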

The performance of each domain on its own (Analysis 3), both with 
and without age-14 smoking, is displayed in Extended Data Fig. 4a. The 
History and Personality domains were each accurate classifiers (AUC > 
0.8). Next, we sought to quantify the unique contribution of each domain 
to the classification of current binge drinkers both with (Analysis 4) and 
without (Analysis 5) age-14 smoking. To this end, we iteratively removed 
each domain from the full model (re-calculating the optimum elastic 
net parameters), and observed the relative reduction in classification 
accuracy (Extended Data Fig. 4b, c). The History domain contributed 
the greatest unique variance to the model (significant correlations among 
features are displayed in Extended Data Fig. 5). The results of external 
generalizations of the current binge drinking models with and without 
nicotine (Analyses 6 and 7, respectively) are displayed in Extended Data 
Fig. 2c, d, g, h. 

We have described the profile of current alcohol misusers while also 
demonstrating the efficacy of our modelling approach. However, to 


[Figure 1 plot. Panel title: Prediction (b). Horizontal axis: correlation coefficient, -0.50 to 0.50. Feature labels by domain:
History: romantic events; 1-2 alcohol uses by age 14; deviance valence; deviance history; family valence; gestational alcohol exposure; family hx of drug misuse.
Personality: conscientiousness; extravagance; excitability; disorganization; hopelessness; anxiety sensitivity; sensation seeking; extraversion; neuroticism.
Genetics: rs10758821T; rs2369955C; rs2140418T; rs10893366T.
Brain: parenchymal volume; reward outcome; GMV:WMV; failed inhibition; regional GMV; emotional reactivity; reward anticipation.
Cognitive: foil recognition; SWM strategy; SWM errors; RT variability; RT mean; performance IQ; verbal IQ; AGN positive omissions.
Demographics: age; handedness.]


Figure 1 (continued) | a, Analyses 1 and 2, the classification of binge drinking at age 14 years
(n = 265). b, Analysis 8 predicting binge drinking at age 16 years (n = 271).
AGN, affective go/no go; hx, history; SURPS, substance use risk profile scale;
SWM, spatial working memory; GMV, grey matter volume; WMV, white
matter volume.
Figure 2 | Brain regions associated with binge drinking and the relative
contribution of each brain metric to the classification. The average beta
weight for each brain metric (normalized to sum to 1 and averaged over the ten
outer folds). Error bars depict standard errors of the mean across the folds.
a, b, Brain regions that classify binge drinking at age 14, Analyses 1 and 2

identify risk factors for adolescent alcohol misuse, a matter of clinical 
relevance, a model that predicts future binge drinking is required. Thus, 
in Analysis 8, we compared 121 future binge drinkers (a maximum of 
two drink occasions by age 14 and a minimum of three lifetime binge 
drinking episodes by age 16) to the 150 controls described previously. 
This model had an AUC of 0.75 (95% CI = 0.69-0.80; Extended Data 
Fig. 2i, j). At the optimum point in the ROC curve, 73% of non-binge
drinkers and 66% of binge drinkers were correctly classified, significantly
better than chance (P = 4.2 × 10⁻¹⁷) given a base rate of 45%
binge drinkers. This corresponds to a precision rate of 64% and a recall 
rate of 93% at the maximum F-score value. The features of the final 
model are displayed in Fig. 1b. Figure 2b displays the brain regions that 
discriminated future binge drinkers from non-binge-drinkers and the 
contributions of each functional/structural feature are displayed in Ex- 
tended Data Fig. 6. 

Next, we examined each domain on its own (Analysis 9). History 
was still the most predictive domain; however, now its influence was 
broadly comparable to Brain and Personality (Extended Data Fig. 4d), 
although the unique contribution of History was more apparent when 
each domain was iteratively removed from the model (Analysis 10; Ex- 
tended Data Fig. 4e). Significant correlations among the features are 
displayed in Extended Data Fig. 7. 

Our profile of adolescent binge drinking used a large sample and was 
internally valid, in that it generalized well using cross-validation. How- 
ever, an outstanding question is whether or not this profile would be 
applicable to a new sample with different levels of alcohol consumption, 
which would speak to the dimensional nature of substance misuse.
Thus, we applied the prediction model from Analysis 8 to a new sample
from the IMAGEN study (Analysis 11): all subjects had between 3 and 5
lifetime drink occasions (that is, a score of 2 on the substance misuse 
questionnaire) but no binge drinking episodes by age 14; by age 16, 61 
of these still had no binge-drinking episodes whereas 55 participants 
had at least 3 binge-drinking episodes. Application of the model (with- 
out age-14 drinking as this was the same for all participants) resulted in 
similar predictability to that reported above: ROC AUC = 0.75 (95% 
CI = 0.66-0.83). At the optimal point of the ROC curve, 77% of binge drinkers
and 67% of non-binge-drinkers were correctly assigned (P = 2.71 X 10°). 
At the maximum F-score value, this corresponds to a precision rate of
65% and a recall rate of 93%. The most robust brain predictors of future
binge drinking were the right middle and precentral gyri (Brodmann
Area 6) and bilateral superior frontal gyrus (Brodmann Area 9). At age
14 future binge drinkers had reduced grey matter volume but increased
activity when receiving a reward in the superior frontal gyrus compared
to controls. In premotor cortex, future binge drinkers showed
greater grey matter volume and greater activity when failing to inhibit.

Figure 2 (continued) | (n = 265). The most robust brain classifiers were in ventromedial prefrontal
cortex (a) and the left inferior frontal gyrus (b). c, d, Brain regions that predict
binge drinking at age 16, Analysis 8 (n = 271). The most robust brain predictors
of future binge drinking were the right precentral gyrus (c) and bilateral
superior frontal gyrus (d).

A number of features were common to both current and future al- 
cohol misuse (Analyses 2 and 8). Life events, such as a romantic or sex- 
ual relationship, were strong classifiers for both current and future binge 
drinkers. Personality measures associated with binge drinking included 
the novelty-seeking trait from the temperament and character inventory
(TCI) psychobiological model of personality. This trait identifies
the behaviour of searching for, and feeling rewarded by, novel experi- 
ences and is regarded as a heritable, dopamine-related temperament: 
higher scores on Disorderliness and Extravagance (a tendency to ap- 
proach reward cues) characterized both current and future binge drin- 
kers. Conscientiousness (the degree to which an individual is organized, 
controlled and motivated to achieve a desired goal) was lower in both 
current and future binge drinkers. 

Some features differed in their utility to classify current and future 
binge drinkers. Disruptive family events, the personality trait of agree- 
ableness, more developed pubertal status, impulsivity and higher delay 
discounting (the tendency to devalue future rewards) classified current, 
but not future, binge drinkers. In contrast, the anxiety sensitivity subscale
of the substance use risk profile scale (SURPS) (fear of anxiety-related
emotions and sensations due to beliefs that these emotions and
sensations could lead to harmful consequences) predicted non-binge
drinking at age 16, but not at age 14. Parenchymal volume and grey:white
matter ratio predicted future, but not current, binge drinking. The most
prominent brain regions for classifying current binge drinkers included
the vmPFC and the left lateral PFC, areas that have been implicated in
emotional regulation of bingeing behaviour. Whereas emotional
processing areas were implicated in age-14 binge drinking, predicting
age-16 binge drinkers from data at age 14 relied relatively more on regions
associated with failed inhibitory control and reward outcome and
on local and global brain structure. Notably, even 1-2 lifetime alcohol
occasions by age 14 were sufficient to be an important predictor of future
binge drinking at age 16.

We have identified a generalizable risk profile for alcohol misuse initiation.
In contrast with the classification of current binge drinkers,
which was primarily a function of the History domain, the prediction
of future binge drinking relied relatively more on a combination of three
domains: History, Personality and Brain (individual ROC AUCs of 0.68,
0.67 and 0.63, respectively; Analysis 9). Thus, these results point to the
value of a multi-domain analysis for predicting adolescent alcohol misuse
and speak to the multiple causal factors for alcohol misuse. Further,
we note that the influence of any one feature in isolation was modest,
consistent with data showing that effect sizes in previous studies with smaller
samples are likely to have been overestimated. Given that the odds
of adult alcohol dependence can be reduced by 10% for each year drinking
onset is delayed in adolescence, this risk profile may facilitate the
development of targeted interventions, which often yield higher effect
sizes than general approaches.


METHODS SUMMARY 


Informed consent was obtained from all subjects and their parents/guardians. We 
collected a wide range of data at age 14, which were arranged into the following 
domains: Brain, Personality, Cognition, History, Genetics and Demographics (Ex- 
tended Data Table 1). Substance misuse data were acquired at both ages 14 and 16. 
Functional brain activity was recorded during reward anticipation and outcome, 
successful and unsuccessful inhibitions on a test of motor inhibitory control, and a
test of emotional reactivity to angry faces. Structural brain data consisted of regional 
grey matter volume, total parenchymal volume, and white:grey matter ratio. Per- 
sonality data included both broad personality traits and those specifically related 
to substance misuse. Cognitive measures assessed IQ, delay discounting, spatial 
working memory, attentional biases for affective stimuli and behavioural mea- 
sures from the functional imaging tasks. The History domain included life events, 
family history of alcohol and drug misuse and gestational alcohol and cigarette 
exposure. We assessed 15 candidate genes related to alcohol abuse, and demographic
features included sex, pubertal development status and socioeconomic
status. To construct the models, a multistep procedure was used to create summary 
scores from brain images first, which were then combined with the other data (Ex- 
tended Data Fig. 1). Classification was conducted using logistic regression with 
elastic net regularization, allowing the inclusion of correlated features in sparse
model fits (that is, potentially selecting a subset of features). We report data from 
the test sets of a tenfold cross validation. 
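
Logistic regression with elastic net regularization, evaluated on held-out folds, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the solver (proximal gradient descent), the fixed penalty settings `lam` and `alpha`, the interleaved fold assignment and the toy data are all assumptions, and the parameter tuning described for the study is omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v towards zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit(X, y, lam=0.2, alpha=0.5, lr=0.1, n_iter=2000):
    """Elastic-net logistic regression by proximal gradient descent.

    Penalty: lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2^2); the intercept is unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        err = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
               for row, yi in zip(X, y)]
        for j in range(p):
            grad = sum(e * row[j] for e, row in zip(err, X)) / n + lam * (1 - alpha) * w[j]
            # Gradient step on the smooth part, then L1 shrinkage (this yields exact zeros).
            w[j] = soft_threshold(w[j] - lr * grad, lr * lam * alpha)
        b -= lr * sum(err) / n
    return w, b


def predict(w, b, row):
    """Classify as 1 when the linear predictor is positive."""
    return 1 if b + sum(wj * xj for wj, xj in zip(w, row)) > 0 else 0


def cross_validated_accuracy(X, y, k=10, **fit_args):
    """k-fold cross-validation: fit on k-1 folds, score on the held-out fold."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w, b = fit([X[i] for i in train], [y[i] for i in train], **fit_args)
        correct += sum(predict(w, b, X[i]) == y[i] for i in test)
    return correct / len(X)


# Toy data: feature 0 carries signal, feature 1 is balanced noise (illustrative only).
X, y = [], []
for x1, label in [(0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1), (-0.2, 1),
                  (-0.5, 0), (-1.0, 0), (-1.5, 0), (-2.0, 0), (0.2, 0)]:
    for x2 in (1.0, -1.0):
        X.append([x1, x2])
        y.append(label)

weights, intercept = fit(X, y)
cv_accuracy = cross_validated_accuracy(X, y, k=5)  # k=5 only because the toy set is small
```

The L1 part of the penalty sets uninformative weights exactly to zero (a sparse fit), while the L2 part keeps the fit stable when features are correlated, which is the property the passage above attributes to the elastic net.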


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
A common Greenlandic TBC1D4 variant confers
muscle insulin resistance and type 2 diabetes

The Greenlandic population, a small and historically isolated founder
population comprising about 57,000 inhabitants, has experienced a
dramatic increase in type 2 diabetes (T2D) prevalence during the past
25 years. Motivated by this, we performed association mapping of T2D-related
quantitative traits in up to 2,575 Greenlandic individuals without
known diabetes. Using array-based genotyping and exome sequencing,
we discovered a nonsense p.Arg684Ter variant (in which arginine is
replaced by a termination codon) in the gene TBC1D4 with an allele
frequency of 17%. Here we show that homozygous carriers of this variant
have markedly higher concentrations of plasma glucose (β = 3.8 mmol l⁻¹,
P = 2.5 × 10⁻³⁵) and serum insulin (β = 165 pmol l⁻¹, P = 1.5 × 10⁻²⁰)
2 hours after an oral glucose load compared with individuals with other
genotypes (both non-carriers and heterozygous carriers). Furthermore,
homozygous carriers have marginally lower concentrations of fasting
plasma glucose (β = −0.18 mmol l⁻¹, P = 1.1 × 10⁻⁶) and fasting serum
insulin (β = −8.3 pmol l⁻¹, P = 0.0014), and their T2D risk is markedly
increased (odds ratio (OR) = 10.3, P = 1.6 × 10⁻²⁴). Heterozygous
carriers have a moderately higher plasma glucose concentration 2 hours
after an oral glucose load than non-carriers (β = 0.43 mmol l⁻¹, P =
5.3 × 10⁻⁵). Analyses of skeletal muscle biopsies showed lower messenger
RNA and protein levels of the long isoform of TBC1D4, and
lower muscle protein levels of the glucose transporter GLUT4, with
increasing number of p.Arg684Ter alleles. These findings are concomitant
with a severely decreased insulin-stimulated glucose uptake
in muscle, leading to postprandial hyperglycaemia, impaired glucose
tolerance and T2D. The observed effect sizes are several times larger
than any previous findings in large-scale genome-wide association
studies of these traits and constitute further proof of the value of
conducting genetic association studies outside the traditional setting
of large homogeneous populations.
Genetic association studies have traditionally been performed in large
homogeneous populations. However, several studies have shown that it
can be valuable to use founder populations, and there are similar advantages
to using small and historically isolated populations. These advantages
include increased statistical power to detect associations, owing to
extended linkage disequilibrium and to an increased probability that deleterious
variants overcome their selective disadvantage and reach high
allele frequencies as a result of substantial genetic drift over many generations.
Therefore, we aimed to identify genetic variants associated
with glucose homeostasis in the Greenlandic population, which is a small
and historically isolated founder population, by association mapping of
four T2D-related traits: plasma glucose and serum insulin levels at fasting
and 2 h after an oral glucose load.

We successfully genotyped 2,733 participants in the Inuit Health in
Transition (IHIT) cohort, sampled from 12 regions in Greenland (Fig. 1a),
with the Illumina Cardio-Metabochip (Metabochip) (Extended Data
Table 1). We observed a high degree of linkage disequilibrium compared
with Europeans (Extended Data Fig. 1a) and a high degree of European
admixture (Fig. 1b). Additionally, as a natural consequence of the fact
that the cohort constitutes almost 5% of the population, we identified more
than 1,000 close relationships (siblings or parent-offspring) (Extended
Data Fig. 1b). Population structure can lead to both decreased statistical
power and increased type I error rates. To avoid the latter, we therefore
performed association analyses using a linear mixed model, which takes
both admixture and relatedness into account. Genomic control inflation
factors showed no inflation (range, 0.937-0.995; Fig. 2a and Extended
Data Fig. 2).
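
The genomic control inflation factor mentioned above is conventionally the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null (about 0.455). The sketch below assumes that conventional definition and works from P values alone; the helper name and the toy P values are hypothetical, and the authors' exact computation is not given in this passage.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic control lambda from two-sided P values of 1-df tests.

    Each P value is mapped back to its chi-square statistic via the normal
    quantile (chi2 = z^2), using the lower tail so very small P values stay finite.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    null_median = nd.inv_cdf(0.75) ** 2  # median of a 1-df chi-square, about 0.455
    return median(chi2) / null_median


# Toy example: perfectly uniform P values give lambda = 1 (no inflation).
uniform_p = [(i - 0.5) / 1001 for i in range(1, 1002)]
lam_null = genomic_inflation(uniform_p)

# Systematically smaller P values, as unmodelled structure might produce, give lambda > 1.
lam_inflated = genomic_inflation([p * p for p in uniform_p])
```

Values at or slightly below 1, such as the range quoted above, indicate that the test statistics are not inflated relative to the null.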

Discovery analyses were performed for up to 2,575 IHIT participants 
without previously known T2D using an additive model. We found that 
the minor allele of rs7330796 was strongly associated with a higher 2-h 
plasma glucose level (P = 4.2 X 10 '”) and a higher 2-h serum insulin 
level (P = 6.4X 10") (Fig. 2a, b and Extended Data Fig. 2). These associations
were replicated in the B99 Greenlandic cohort (P = 4.2 X 10° ° for 2-h
plasma glucose and P = 2.7 X 10 ° for 2-h serum insulin). 


Figure 1 | Greenlandic study population. a, Sampling locations in
Greenland. b, Estimated admixture proportions of Inuit and European
ancestry. The admixture proportions were estimated assuming two source
populations (K = 2). The estimates are both for the 2,733 individuals in
the Greenlandic sample (IHIT), depicted to the left of the vertical line, and for
50 Danes, to the right of the vertical line.
Figure 2 | Associations between 2-h plasma glucose levels and genotypes, as
determined by Metabochip assay. Tests were performed using an additive
linear mixed model in 2,540 individuals from the IHIT study who were not
known to have T2D and for whom valid 2-h plasma glucose data were available.
a, A quantile-quantile (QQ) plot of the observed −log10[P] values (y axis)
versus the −log10[P] values expected under the null hypothesis of no
association (x axis). The red line shows x = y, and the black lines demarcate the
95% confidence interval. The λ value is the genomic control inflation factor.
b, A Manhattan plot of the observed −log10[P] values. The dashed horizontal
line indicates a 0.05 significance threshold after Bonferroni correction for
multiple testing. The lowest P value is for rs7330796 on chromosome 13.
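
The two reference quantities in the Figure 2 caption, the Bonferroni-corrected threshold and the expected −log10[P] values of a QQ plot, follow from simple formulas. In the sketch below the number of tests is a hypothetical parameter (the count of SNPs actually tested is not stated in this passage), and using i/(n + 1) as the expected P value of the i-th smallest of n is one common plotting convention, assumed here.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_neg_log10_p(n_tests):
    """Expected -log10(P) for ranked P values under the null (uniform on 0..1).

    The i-th smallest of n uniform P values has expectation i / (n + 1).
    """
    return [-math.log10(i / (n_tests + 1)) for i in range(1, n_tests + 1)]


# Hypothetical example: 100,000 tested SNPs.
threshold = bonferroni_threshold(100_000)   # 5e-7
line_height = -math.log10(threshold)        # height of the dashed line on a Manhattan plot
x_axis = expected_neg_log10_p(9)            # x coordinates for a 9-point QQ plot
```

On a QQ plot, observed values that track the x = y line indicate P values that behave as expected under the null, with true associations appearing as departures at the upper right.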


The rs7330796 variant was selected for inclusion on the Metabochip
because it was in the top 5,000 candidate single nucleotide polymorphisms
(SNPs) for association with waist-to-hip ratio and it has not previously
been reported to be associated with any of the four examined T2D-related
traits or with T2D. The variant is located in intron 11 of TBC1D4 and is
neither in high linkage disequilibrium with neighbouring variants on the
Metabochip nor situated inside a long range linkage disequilibrium
block (Extended Data Fig. 3a, b). To locate the causal variation in the
region, we performed exome sequencing of nine trios, and we identified
four coding SNPs in high linkage disequilibrium (r² > 0.8) with rs7330796
(Extended Data Table 2). We genotyped these SNPs and found that
p.Arg684Ter, a nonsense polymorphism in TBC1D4 (c.2050C>T,
rs61736969), was strongly associated with 2-h plasma glucose levels (P =
3.6 × 10⁻²⁵) in the IHIT cohort (Table 1 and Fig. 3a). Conditional analyses
demonstrated that p.Arg684Ter was significantly associated with
2-h plasma glucose and 2-h serum insulin levels when conditioning on
rs7330796 (P= 1.3 X 10 ° and P=89X10 ”, respectively), whereas
rs7330796 was not associated with the two traits when conditioning on
p.Arg684Ter (P = 0.47 and P = 0.09) (Fig. 3a). Additionally, the mean
2-h plasma glucose levels for individuals with two copies of the minor
rs7330796 allele increased with increasing Inuit admixture proportion
(Extended Data Fig. 4a), which is expected if a variant is not causative
and if the linkage disequilibrium patterns differ between Inuit
and Europeans. By contrast, the same was not true for p.Arg684Ter (Extended
Data Fig. 4b). These findings suggest that p.Arg684Ter is the
causative variant.
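
The linkage disequilibrium measure r² used above to select coding SNPs can be approximated from unphased genotypes as the squared Pearson correlation between allele counts. The sketch below makes that assumption explicit; the function name and the toy genotype vectors are hypothetical, and the estimator the authors used is not specified in this passage.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two vectors of allele counts (0, 1 or 2)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    return cov * cov / (v1 * v2)


# Toy genotypes (illustrative only): one entry per individual.
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [0, 1, 2, 0, 1, 2]      # perfectly correlated with snp_a
snp_c = [0, 0, 2, 2]
snp_d = [0, 2, 0, 2]            # uncorrelated with snp_c

r2_linked = genotype_r2(snp_a, snp_b)      # 1.0
r2_unlinked = genotype_r2(snp_c, snp_d)    # 0.0
passes_filter = r2_linked > 0.8            # the cut-off quoted in the text
```

A high r² means that one variant largely predicts the other, which is why conditioning on one of two linked variants, as in the conditional analyses above, is needed to tell which of them carries the association.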

The mean 2-h plasma glucose levels stratified by p.Arg684Ter genotype 
suggested that the variant mainly has an effect in homozygous carriers, 
indicating a recessive inheritance (Fig. 3b). We therefore also performed 
analyses using a recessive model: that is, we compared homozygous 
Figure 3 | Effect of the p.Arg684Ter nonsense polymorphism in TBC1D4.
a, Association test results for all tested Metabochip SNPs in a 2-megabase (Mb)
region around the p.Arg684Ter polymorphism (shown as a dashed vertical
line). Each SNP is represented by a coloured circle. The position of the circle
on the x axis shows the genomic position of the SNP. The position of the circle
on the left y axis shows the −log10[P] value of the SNP when testing for
association with 2-h plasma glucose levels, as determined using an additive
model. The colour of the circle indicates the extent of correlation (r²) between
the SNP and p.Arg684Ter. The circles representing p.Arg684Ter and
rs7330796 are labelled (to the left of the circles). For every SNP, except for
p.Arg684Ter, there is also a white diamond, which illustrates the P value
obtained by testing for association conditional on p.Arg684Ter. The solid blue
line illustrates the recombination rate from the Chinese HapMap (CHB)
panel (in centimorgan (cM) per Mb, right y axis). The protein-coding genes in
the genetic region are shown below the plot. b, The mean 2-h plasma glucose
and the frequency of T2D for three genotypes (zero, one or two p.Arg684Ter
alleles). Superimposed are the estimated effect sizes from the mixed
model ± s.e.m. c, The two predominant isoforms of the TBC1D4 gene
illustrating which exons are transcribed: the long isoform, which has two
additional exons (top), and the short isoform (bottom). Exons are depicted as
boxes, and the location of the p.Arg684Ter polymorphism is indicated by
a red arrow. d, The mRNA expression level of the long TBC1D4 isoform in
skeletal muscle from nine Greenlandic individuals (measured in arbitrary
units (a.u.)). The mean value for each genotype group is shown as a horizontal
line. e, Abundance of the long TBC1D4 protein isoform in skeletal muscle
from nine individuals as quantified from western blot (measured in a.u.). The
mean value for each genotype group is shown as a horizontal line.
Table 1 | Association of TBC1D4 p.Arg684Ter with metabolic traits in the IHIT cohort

Trait | n | Additive β_s.d. (95% CI) | Additive β | Additive P | Recessive β_s.d. (95% CI) | Recessive β | Recessive P
------------------------------------------------------------------------------------------------------------------------------------------------
Fasting plasma glucose (mmol l⁻¹) | 2,546 | −0.13 (−0.2 to −0.064) | −0.048 | 0.00011 | −0.45 (−0.63 to −0.27) | −0.18 | 1.1 × 10⁻⁶
2-h plasma glucose (mmol l⁻¹) | 2,511 | 0.35 (0.29 to 0.42) | 1.1 | 3.6 × 10⁻²⁵ | 1.2 (0.99 to 1.4) | 3.8 | 2.5 × 10⁻³⁵
Fasting serum insulin (pmol l⁻¹) | 2,546 | −0.14 (−0.21 to −0.07) | −2.3 | 0.00012 | −0.33 (−0.53 to −0.13) | −8.3 | 0.0014
2-h serum insulin (pmol l⁻¹) | 2,511 | 0.29 (0.22 to 0.36) | 57 | 6.7 × 10⁻¹⁷ | 0.90 (0.71 to 1.1) | 160 | 1.5 × 10⁻²⁰
Fasting serum C-peptide (pmol l⁻¹) | 2,546 | −0.12 (−0.19 to −0.049) | −28 | 0.00092 | −0.32 (−0.52 to −0.13) | −85 | 0.0012
2-h serum C-peptide (pmol l⁻¹) | 2,511 | 0.30 (0.24 to 0.36) | 360 | 4.4 × 10⁻²⁰ | 0.82 (0.65 to 1) | 1,000 | 8.5 × 10⁻²⁰
HbA1c (%) | 2,692 | 0.015 (−0.047 to 0.078) | 0.015 | 0.63 | 0.2 (0.036 to 0.37) | 0.10 | 0.017
HOMA-IR (mmol l⁻¹ × pmol l⁻¹) | 2,546 | −0.15 (−0.22 to −0.076) | −0.078 | 6.4 × 10⁻⁵ | −0.36 (−0.56 to −0.16) | −0.37 | 0.00047
ISI0,120 | 2,487 | −0.32 (−0.39 to −0.26) | −0.47 | 1.4 × 10⁻²⁰ | −1.0 (−1.2 to −0.86) | −1.4 | 1.6 × 10⁻²⁷
HOMA-B (%) | 2,545 | −0.085 (−0.15 to −0.018) | −2.3 | 0.013 | −0.12 (−0.31 to 0.062) | −4.2 | 0.19
T2D (cases/controls) | 220/1,810 | 0.083 (0.059 to 0.11) | 0.083 | 2.1 × 10⁻¹¹ | 0.37 (0.3 to 0.44) | 0.37 | 1.6 × 10⁻²⁴
Fasting serum HDL-cholesterol (mmol l⁻¹) | 2,702 | 0.032 (−0.035 to 0.099) | 0.019 | 0.34 | 0.098 (−0.082 to 0.28) | 0.064 | 0.29
Fasting serum total cholesterol (mmol l⁻¹) | 2,566 | −0.013 (−0.081 to 0.056) | −0.018 | 0.71 | 0.25 (0.064 to 0.44) | 0.30 | 0.0086
Fasting serum triglyceride (mmol l⁻¹) | 2,702 | −0.040 (−0.11 to 0.032) | −0.020 | 0.27 | 0.022 (−0.17 to 0.22) | 0.038 | 0.82
BMI (kg m⁻²) | 2,673 | −0.036 (−0.11 to 0.036) | −0.19 | 0.32 | 0.047 (−0.15 to 0.24) | 0.25 | 0.63

Results are shown for an additive and a recessive genetic model. For each trait, n is the number of individuals with genotype data for the specific variant and phenotype data for the specific trait. β_s.d. is the effect size
estimated using quantile-transformed values of the trait (except for the binary trait T2D), and β is the effect size estimated using untransformed values. The P values were obtained from the quantile-transformed-value-based
analyses. All P values <1 x 10~© are marked in bold and were all successfully replicated in the B99 cohort (Extended Data Table 3). Individuals with known T2D were removed from the analysis of
quantitative traits. Highly significant associations were also observed when removing both individuals with known and screen-detected T2D from the analyses (data not shown). BMI, body mass index; HDL,
high-density lipoprotein; HOMA-IR, homeostasis model assessment-estimated insulin resistance.
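
The two genetic models and the quantile transformation named in the Table 1 legend can be sketched as follows. The genotype codings are the standard ones for additive and recessive models; the rank-based inverse normal transform with a (rank − 0.5)/n offset is one common choice and is an assumption here, as the legend does not state which variant was used. All function names and values are hypothetical.

```python
from statistics import NormalDist


def additive_code(n_alleles):
    """Additive model: the predictor is the number of variant alleles (0, 1 or 2)."""
    return n_alleles


def recessive_code(n_alleles):
    """Recessive model: homozygous carriers (2 alleles) versus all other genotypes."""
    return 1 if n_alleles == 2 else 0


def quantile_transform(values):
    """Rank-based inverse normal transform; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - 0.5) / n) for r in ranks]


# Toy example (illustrative values only).
genotypes = [0, 1, 2]
additive = [additive_code(g) for g in genotypes]     # [0, 1, 2]
recessive = [recessive_code(g) for g in genotypes]   # [0, 0, 1]
z_scores = quantile_transform([5.0, 7.2, 6.1])       # middle value maps to 0.0
```

Because the transformed trait has roughly unit variance, an effect estimated on it is expressed in standard deviations, which is why the table reports β_s.d. alongside β in the trait's own units.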


carriers with carriers of other genotypes. These analyses demonstrated 
that homozygous carriers of p.Arg684Ter in the IHIT cohort had a 3.8 mmol l⁻¹ 
higher 2-h plasma glucose level (P under the recessive model (Prec) = 2.5 × 10⁻³⁵) (Table 1). 
Although the main effect was seen when comparing homozygous 
p.Arg684Ter carriers with all other individuals, even heterozygous carriers 
displayed a 0.43 mmol l⁻¹ higher 2-h plasma glucose level than non-carriers 
(P = 5.3 X 10°) (Fig. 3b). To further investigate the metabolic 
implications of p.Arg684Ter, we analysed additional metabolic traits 
(Table 1). Analyses of 220 individuals with T2D and 1,810 non-diabetic 
control individuals showed a strong association of p.Arg684Ter with 
increased risk of T2D (P under the additive model (Padd) = 2.1 × 10⁻¹¹). Similar to the 
2-h plasma glucose levels, the data suggested a recessive inheritance for 
T2D (ORrec = 10.3; Prec = 1.6 × 10⁻²⁴) (Fig. 3b). Interestingly, when using 
an alternative definition of T2D that is based on recent HbA1c criteria 
and does not include plasma glucose data, the association was modest 
(P = 0.0084). This finding is in line with the modest association of 
p.Arg684Ter with HbA1c as a quantitative trait (Table 1). We also 
found that p.Arg684Ter was associated with decreased peripheral 
insulin sensitivity, as estimated by the Gutt insulin sensitivity index 
(ISI)¹⁰ (ISI0,120: βadd = −0.32 s.d., Padd = 1.4 × 10⁻²⁰; βrec = −1.0 s.d., 
Prec = 1.6 × 10⁻²⁷) (Table 1). We replicated these findings in the B99 cohort 
and found consistent results (Extended Data Table 3). Finally, we 
found associations of p.Arg684Ter with lower fasting plasma glucose 
and serum insulin levels in the IHIT cohort but with substantially lower 
effect sizes than the glucose-stimulated effects (βadd = −0.048 mmol l⁻¹, 
Padd = 0.00011; βrec = −0.18 mmol l⁻¹, Prec = 1.1 X 107° and βadd = 
−2.3 pmol l⁻¹, Padd = 0.00012; βrec = −8.3 pmol l⁻¹, Prec = 0.0014, 
respectively) (Table 1). Thus, our findings indicate that the p.Arg684Ter 
TBC1D4 variant confers increased risk of a subset of diabetes that fea- 
tures deterioration of postprandial glucose homeostasis. In this context, 
it is of interest that 2-h glucose levels appear to be a better predictor 
of cardiovascular disease than do fasting plasma glucose levels¹¹. The 
p.Arg684Ter variant showed no significant association with HOMA-B, 
an estimate of basal β-cell function. Similarly, no convincing associations 
were detected with measures of adiposity, fasting lipid levels or 
other components of metabolic syndrome (Table 1). 

The impact of p.Arg684Ter in its recessive form on 2-h plasma glucose 
(3.8 mmol l⁻¹) and T2D risk (OR, 10.3) is several times larger than any 
effect that has been reported in large-scale genome-wide association 
studies for these traits. Furthermore, the p.Arg684Ter polymorphism 
has a high population impact, as 3.8% of Greenlanders are homozygous 
carriers of the risk allele. In the IHIT cohort, 15.5% of the patients 
with T2D are homozygous carriers of the risk allele, in contrast to 1.6% 
among glucose-tolerant individuals, indicating that p.Arg684Ter accounts 
for more than 10% of all cases of T2D in Greenland. Between 40 and 60 
years of age, more than 60% of the homozygous carriers have T2D, and 
this increases to more than 80% above the age of 60. The effect of 
p.Arg684Ter thus mirrors a Mendelian-disease-like pattern of inheritance. 
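
The population impact quoted above can be checked with a rough calculation. The sketch below is not the authors' method (the text does not say how the "more than 10%" figure was derived). It applies Miettinen's case-based attributable-fraction formula, p_c × (RR − 1)/RR, and assumes that the recessive odds ratio of 10.3 can stand in for the relative risk, which overstates the risk ratio when a disease is common. The function name is illustrative.

```python
def attributable_fraction(case_exposure: float, relative_risk: float) -> float:
    """Miettinen's population attributable fraction, p_c * (RR - 1) / RR.

    case_exposure: proportion of cases that carry the risk genotype.
    relative_risk: risk ratio for carriers versus non-carriers.
    """
    if not 0.0 <= case_exposure <= 1.0:
        raise ValueError("case_exposure must be a proportion in [0, 1]")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    return case_exposure * (relative_risk - 1.0) / relative_risk


# From the text: 15.5% of patients with T2D are homozygous carriers.
# Assumption: the odds ratio (10.3) is used in place of a relative risk.
paf = attributable_fraction(0.155, 10.3)
print(f"Attributable fraction: {paf:.1%}")
```

Under these assumptions the fraction comes out somewhat above 10%, which is consistent with the statement in the text.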

The p.Arg684Ter variant has a minor allele frequency (MAF) of 17% 
in the IHIT cohort, and we estimated it to have a MAF of 23% and 0%, respectively, 
in the unobserved Inuit and European populations that are ancestral 
populations to the Greenlanders. In comparison, this variant was found 
in only 1 Japanese individual (NA18989) out of the 1,092 individuals 
sequenced in the 1000 Genomes Project, and it was not present in 
exome sequencing data from 2,000 Danish individuals, 448 Han Chinese 
individuals or ~6,500 European and African American individuals. 
Thus, the variant is not unique to the Greenlandic population but is prob- 
ably common only among Greenlanders and other related populations. 
This finding raises the question of whether the variant has been favoured 
by natural selection or whether it has increased in frequency as a result of 
genetic drift. A test for selection showed weak evidence for positive selec- 
tion (Extended Data Fig. 5 and Supplementary Notes 1). 

TBC1D4, also known as AS160, acts as a mediator of insulin-stimulated 
Akt-induced glucose uptake through Rab-mediated regulation of GLUT4 
mobilization. Tbc1d4-knockout mice have decreased basal plasma glucose 
levels and are resistant to insulin-stimulated glucose uptake in muscle 
and adipose tissue. Furthermore, the overall GLUT4 levels in these mice 
are markedly lower than those of Tbc1d4-sufficient mice. Two isoforms 
of the TBC1D4 gene have been reported: one encodes a full-length protein, 
and the other encodes a short form lacking exons 11 and 12 (Fig. 3c). It is 
predicted that the p.Arg684Ter variant results in termination of TBC1D4 
transcription in exon 11; thus, this variant is expected to affect only the 
long isoform. In line with previous findings, expression analyses in humans 
showed that while the short isoform of TBC1D4 is widely expressed, the 
long isoform is primarily expressed in skeletal muscle and not in other 
major tissues associated with glucose metabolism, such as adipose tissue, 
the liver or the pancreatic islets (Extended Data Fig. 6a, b). Thus, it is unlikely 
that p.Arg684Ter affects the latter tissues. We measured the expression levels 
in skeletal muscle tissue in groups of three individuals carrying zero, one 
or two copies of p.Arg684Ter. As predicted, the levels of long TBC1D4 
isoform mRNA and protein decreased with increasing number of p.Arg684Ter 
alleles (Fig. 3d, e). The short isoform was observed at very low levels in 
skeletal muscle regardless of the genotype (Extended Data Fig. 6c), 
indicating that this isoform is unlikely to contribute to the observed 
phenotype. Further analyses showed that GLUT4 protein levels in the 
skeletal muscle decreased with increasing number of p.Arg684Ter alleles 
(Extended Data Fig. 6d). Thus, the phenotype of global Tbc1d4-knockout 
mice (lower fasting glucose levels and markedly lower insulin-stimulated 
glucose uptake than Tbc1d4-sufficient mice) is comparable to the 
phenotype observed in homozygous carriers of the p.Arg684Ter variant. 
Our data indicate that disruption of the full-length TBC1D4 protein 
in skeletal muscle results in severely decreased insulin-stimulated 
glucose uptake, leading to postprandial hyperglycaemia, impaired glucose 
tolerance and T2D. 

The effect of TBC1D4 p.Arg684Ter is in line with a reported familial 
case of postprandial hyperinsulinaemia caused by a TBC1D4 p.Arg363Ter 
variant. However, the reported p.Arg363Ter variant affects both TBC1D4 
isoforms and consequently many tissues. Moreover, this variant has a 
large effect on insulin-stimulated glucose uptake in heterozygous car- 
riers but not on fasting glucose levels. By contrast, the p.Arg684Ter 
variant discovered here has a large effect only in homozygous carriers 
and is restricted to the long isoform of TBC1D4, thereby affecting 
TBC1D4 signalling in skeletal muscle but not in β-cells, the liver or 
adipose tissue. Furthermore, the p.Arg684Ter variant affects fasting 
glucose levels. Finally, the high frequency of the nonsense variant in 
Greenlanders has allowed us to assess the physiological impact of this 
variant with high statistical confidence. 

In summary, our study demonstrates the strength of conducting genetic 
association mapping outside the traditional setting of large homogeneous 
populations. We report a novel association of a common TBC1D4 
nonsense variant with T2D and elevated circulating glucose and insulin 
levels after an oral glucose load. The effect sizes of the variant markedly 
exceed previously reported associations of common genetic variants 
with metabolic traits. The variant leads to a prematurely terminated 
transcript of the long isoform of TBC1D4, which in homozygous car- 
riers causes insulin resistance in skeletal muscle and confers a high risk 
of a subtype of T2D that is characterized by a deterioration of post- 
prandial glucose homeostasis. 


METHODS SUMMARY 


Discovery association analyses were performed on the IHIT cohort data and replicated 
with the B99 cohort data (Extended Data Table 1). Participants underwent an 
oral glucose tolerance test, with plasma glucose and serum insulin levels measured at 
fasting and after 2 h. Diabetes was classified according to World Health Organization 
criteria. Samples were genotyped using the Cardio-Metabochip (Illumina) with 
standard protocols. Samples with mis-specified gender, high rates of missing data 
and duplicates were removed using the software toolset PLINK. For association 
testing, we used a linear mixed model, implemented in the software GEMMA, to 
control for admixture and relatedness. Only participants without known diabetes 
were analysed (IHIT/B99: fasting plasma glucose, n = 2,575/n = 1,064; fasting serum 
insulin, n = 2,575/n = 1,062; 2-h plasma glucose, n = 2,540/n = 845; and 2-h serum 
insulin, n = 2,540/n = 845). All quantitative traits were quantile-transformed to 
a standard normal distribution; age and sex were included as covariates, and tests 
were performed using a likelihood ratio test. Effect sizes and their standard errors 
were estimated using a restricted maximum likelihood approach. Conditional 
analyses were performed by including the SNP that we conditioned on as an addi- 
tional covariate. A meta-analysis was performed using the inverse-variance-based 
method. To estimate the OR for T2D under the recessive model, we used a linear 
mixed model without covariates, estimated α (the intercept) and β (the genotype 
effect) and set OR = {(α + β)/[1 − (α + β)]}/[α/(1 − α)]. After the discovery studies, 
we selected nine trios that we inferred to have no European ancestry, enriching for 
carriers of the rs7330796 minor allele. We exome-sequenced the trios by using 
SureSelect capture (Agilent) followed by HiSeq2000 sequencing (Illumina), aligned 
the data using bwa software and called genotypes using SAMtools software. Variants 
that were in high linkage disequilibrium with rs7330796 were identified, genotyped 
in all individuals and tested for association. Admixture proportions were estimated 
using the software ADMIXTURE, assuming that there are two ancestral populations. 
Relatedness was estimated using the software RelateAdmix, which takes admixture 
into account. Methods for the biological studies are described in Methods. 
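
The odds-ratio conversion described above can be written out as a short function. This is a minimal sketch of the stated formula only, not of the mixed-model fit itself. The intercept value used in the example is hypothetical and chosen purely for illustration, whereas the genotype effect of 0.37 is the recessive T2D estimate from Table 1. The function name is an assumption.

```python
def odds_ratio_from_linear_model(alpha: float, beta: float) -> float:
    """OR = {(alpha + beta) / [1 - (alpha + beta)]} / [alpha / (1 - alpha)].

    alpha: intercept, read as the T2D probability in the reference group.
    beta: genotype effect, the change in probability for homozygous carriers.
    """
    carrier_risk = alpha + beta
    if not (0.0 < alpha < 1.0 and 0.0 < carrier_risk < 1.0):
        raise ValueError("both risks must lie strictly between 0 and 1")
    carrier_odds = carrier_risk / (1.0 - carrier_risk)
    reference_odds = alpha / (1.0 - alpha)
    return carrier_odds / reference_odds


# beta = 0.37 is the recessive T2D effect reported in Table 1.
# alpha = 0.07 is a hypothetical intercept, not a value from the paper.
example_or = odds_ratio_from_linear_model(alpha=0.07, beta=0.37)
print(f"OR = {example_or:.1f}")
```

Because the linear model estimates effects on the probability scale, the conversion depends on the intercept: the same genotype effect gives a larger OR when the baseline risk is lower.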


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

As modern humans migrated out of Africa, they encountered many 
new environmental conditions, including greater temperature extremes, 
different pathogens and higher altitudes. These diverse environments 
are likely to have acted as agents of natural selection and to have led to 
local adaptations. One of the most celebrated examples in humans is 
the adaptation of Tibetans to the hypoxic environment of the high-altitude 
Tibetan plateau. A hypoxia pathway gene, EPAS1, was previously 
identified as having the most extreme signature of positive 
selection in Tibetans, and was shown to be associated with differences 
in haemoglobin concentration at high altitude. Re-sequencing 
the region around EPAS1 in 40 Tibetan and 40 Han individuals, we 
find that this gene has a highly unusual haplotype structure that can 
only be convincingly explained by introgression of DNA from Deni- 
sovan or Denisovan-related individuals into humans. Scanning a 
larger set of worldwide populations, we find that the selected haplo- 
type is only found in Denisovans and in Tibetans, and at very low 
frequency among Han Chinese. Furthermore, the length of the hap- 
lotype, and the fact that it is not found in any other populations, makes 
it unlikely that the haplotype sharing between Tibetans and Deni- 
sovans was caused by incomplete ancestral lineage sorting rather 
than introgression. Our findings illustrate that admixture with other 
hominin species has provided genetic variation that helped humans 
to adapt to new environments. 

The Tibetan plateau (at greater than 4,000 m) is inhospitable to human 
settlement because of low atmospheric oxygen pressure (~40% lower 
than at sea level), cold climate and limited resources (for example, sparse 
vegetation). Despite these extreme conditions, Tibetans have success- 
fully settled in the plateau, in part due to adaptations that confer lower 
infant mortality and higher fertility than acclimated women of low- 
altitude origin. The latter tend to have difficulty bearing children at high 
altitude, and their offspring typically have low birth weights compared to 
offspring from women of high-altitude ancestry. One well-documented 
pregnancy-related complication due to high altitude is the higher incidence 
of preeclampsia (hypertension during pregnancy). In addition, 
the physiological response to low oxygen differs between Tibetans and 
individuals of low-altitude origin. For most individuals, acclimatiza- 
tion to low oxygen involves an increase in blood haemoglobin levels. 
However, in Tibetans, the increase in haemoglobin levels is limited, 
presumably because high haemoglobin concentrations are associated 
with increased blood viscosity and increased risk of cardiac events, thus 
resulting in a net reduction in fitness. 

Recently, the genetic basis underlying adaptation to high altitude in 
Tibetans was elucidated using exome and single nucleotide polymorphism 
(SNP) array data. Several genes seem to be involved in the response, 
but most studies identified EPAS1, a transcription factor induced 
under hypoxic conditions, as the gene with the strongest signal of 
Tibetan-specific selection. Furthermore, SNP variants in EPAS1 showed 
significant associations with haemoglobin levels in the expected direction 
in several of these studies; individuals carrying the derived allele have 
lower haemoglobin levels than individuals homozygous for the ances- 
tral allele. Here, we re-sequence the complete EPAS1 gene in 40 Tibetan 
and 40 Han individuals at more than 200× coverage to further characterize 
this impressive example of human adaptation. Remarkably, we 
find that the source of adaptation is likely to be the introduction of 
genetic variants from archaic Denisovan-like individuals (individuals 
closely related to the Denisovan individual from the Altai Mountains) 
into the ancestral Tibetan gene pool. 

After applying standard next-generation sequencing filters (see Meth- 
ods), we call a total of 477 SNPs in a region of approximately 129 kilo- 
bases (kb) in the combined Han and Tibetan samples (Supplementary 
Tables 1 and 2). We compute the fixation index (FST; see Methods) between 
Han and Tibetans, and confirm that it is highly elevated in the EPAS1 
region as expected under strong local selection (Extended Data Fig. 1). 
Indeed, by comparison to 26 populations from the Human Genome 
Diversity Panel (Fig. 1) it is clear that the variants in this region are 
far more differentiated than one would expect given the average genome-wide 
differentiation between Han and Tibetans (FST ~0.02, ref. 4). The 
only other genes with comparably large frequency differences between 
any closely related populations are the previously identified loci associated 
with lighter skin pigmentation in Europeans, SLC45A2 and HERC2 
(refs 17–20), although in these examples the populations compared (for 
example, Hazara and French, Brahui and Russians) are more genetically 
differentiated than Han and Tibetans. In populations as closely related 
as Han and Tibetans, we find no examples of SNPs with as much differen- 
tiation as seen in EPAS1, illustrating the uniqueness of its selection signal. 
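
To make the fixation index concrete, the sketch below computes a basic heterozygosity-based FST for a single biallelic SNP in two equally weighted populations. It is offered as an illustration of the quantity, not as the estimator used in the study, which is described in the Methods and may differ. The allele frequencies in the example are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST = (H_T - H_S) / H_T for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_mean = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_mean * (1.0 - p_mean)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # the SNP is monomorphic in the pooled sample
    return (h_total - h_within) / h_total


# Hypothetical frequencies: one strongly and one weakly differentiated SNP.
strong = fst_two_populations(0.9, 0.1)
weak = fst_two_populations(0.52, 0.48)
print(f"strongly differentiated SNP: FST = {strong:.2f}")
print(f"weakly differentiated SNP:   FST = {weak:.4f}")
```

A SNP whose frequency differs greatly between two otherwise similar populations stands out against a low genome-wide average, which is the contrast the comparison above relies on.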

FST is particularly elevated in a 32.7-kb region containing the 32 most 
differentiated SNPs (green box in Extended Data Fig. 1 and Supplemen- 
tary Table 3), which is the best candidate region for the advantageous 
mutation(s). We therefore focus the subsequent analyses primarily on 
this region. Phasing the data (see Methods) to identify Han and Tibetan 
haplotypes in this region (Fig. 2), we find that Tibetans carry a high- 
frequency haplotype pattern that is strikingly different from both their 

Figure 1 | Genome-wide FST versus maximal allele frequency difference. 
The relationship between genome-wide FST (x axis) computed for each pair of 
the 26 populations and maximal allele frequency difference (y axis), first 
explored in ref. 19. Maximal allele frequency difference is defined as the largest 
frequency difference observed for any SNP between a population pair. The 26 
populations are from the Human Genome Diversity Panel (HGDP). The 
labels highlight genes that harbour SNPs previously identified as having strong 
local adaptation. The grey points represent the observed relationship between 
population differentiation (FST) and maximal allele frequency difference; 

the more differentiated populations tend to have mutations with larger 
frequency differences. The star symbol and the yellow symbols represent 
outliers; these are populations that are not highly differentiated but where we 
find some mutations that have higher frequency differences than expected 
(light blue line). 


minority haplotypes and the common haplotype observed in Han Chinese. 
For example, the region harbours a highly differentiated 5-SNP haplo- 
type motif (AGGAA) within a 2.5-kb window that is only seen in Tibetan 
samples and in none of the Han samples (the first five SNPs in Sup- 
plementary Table 3, and blue arrows in Fig. 2). The pattern of genetic 
variation within Tibetans appears even more unusual because none of 
the variants in the five-SNP motif is present in any of the minority haplotypes 
of Tibetans. Even when subject to a selective sweep, we would 
not generally expect a single haplotype to contain so many unique muta- 
tions not found on other haplotypes. 

We investigate whether a model of selection on either a de novo muta- 
tion (SDN) or selection on standing variation (SSV) could possibly lead 
to so many fixed differences between haplotype classes in such a short 
region within a single population. To do so, we simulate a 32.7-kb region 
under these models assuming different strengths of selection and con- 
ditioning on the current allele frequency in the sample (see Methods). 
We find that the observed number of fixed differences between the hap- 
lotype classes is significantly higher than what is expected by simula- 
tions under any of the models explored (Extended Data Fig. 2). Thus 
the degree of differentiation between haplotypes is significantly larger 
than expected from mutation, genetic drift and directional selection alone. 
In other words, it is unlikely (P < 0.02 under either a SSV scenario or 
under a SDN scenario) that the high degree of haplotype differentiation 
could be caused by a single beneficial mutation landing by chance on a 
background of rare SNPs, which are then brought to high frequency by 
selection. The remaining explanations are the presence of strong epistasis 
between many mutations, or that a divergent population introduced the 
haplotype into Tibetans by gene flow or through ancestral lineage sorting. 

We search for potential donor populations in two different data sets: 
the 1000 Genomes Project and whole genome data from ref. 14. We 
originally defined the EPAS1 32.7-kb region boundaries by the level of 
observed differentiation between the Tibetans and Han only (Supplemen- 
tary Table 3, Extended Data Fig. 1 and Fig. 2) as described in the pre- 
vious section. In that region, the most common haplotype in Tibetans 
is tagged by the distinctive five-SNP motif (AGGAA; the first five SNPs 
in Fig. 2), not found in any of our 40 Han samples. We first focus on this 
five-SNP motif and determine whether it is unique to Tibetans or if it is 
found in other populations. 

Intriguingly, when we examine the 1000 Genomes Project data set, we 
discover that the Tibetan five-SNP motif (AGGAA) is not present in any 

Figure 2 | Haplotype pattern in a region defined by SNPs that are at high 
frequency in Tibetans and at low frequency in Han Chinese. Each column is 
a polymorphic genomic location (95 in total), each row is a phased haplotype 
(80 Han and 80 Tibetan haplotypes), and the coloured column on the left 
denotes the population identity of the individuals. Haplotypes of the Denisovan 
individual are shown in the top two rows (green). The black cells represent the 
presence of the derived allele and the grey space represents the presence of 
the ancestral allele (see Methods). The first and last columns correspond to the 
first and last positions in Supplementary Table 3, respectively. The red and 
blue arrows indicate the 32 sites in Supplementary Table 3. The blue arrows 
represent a five-SNP haplotype block defined by the first five SNPs in the 
32.7-kb region. Asterisks indicate sites at which Tibetans share a derived allele 
with the Denisovan individual. 


of these populations, except for a single CHS (Southern Han Chinese) 
and a single CHB (Beijing Han Chinese) individual. Extended Data Fig. 3 
contains the frequencies of all the haplotypes present in the fourteen 
1000 Genomes populations at these five SNP positions. Furthermore, 
when we examine the data set from ref. 14 containing both modern (Papuan, 
San, Yoruba, Mandenka, Mbuti, French, Sardinian, Han, Dai, Dinka, 
Karitiana, and Utah residents of northern and western European ancestry 
(CEU)) and archaic (high-coverage Denisovan and low-coverage 
Croatian Neanderthal) human genomes, we discover that the five-SNP 
motif is completely absent in all of their modern human population samples 
(Supplementary Table 4). Therefore, apart from one CHS and one 
CHB individual, none of the other extant human populations sampled 
to date carry this five-SNP haplotype. Notably, the Denisovan haplo- 
type at these five sites (AGGAA) exactly matches the five-SNP Tibetan 
motif (Supplementary Table 4 and Extended Data Fig. 3). 

We observe the same pattern when focusing on the entire 32.7-kb 
region and not just the five-SNP motif. Twenty SNPs in this region have 
unusually high frequency differences of at least 0.65 between Tibetans 
and all the other populations from the 1000 Genomes Project (Extended 
Data Fig. 4). However, in Tibetans, 15 out of these 20 SNPs are identical 
to the Denisovan haplotype generating an overall pattern of high hap- 
lotype similarity between the selected Tibetan haplotype and the Deni- 
sovan haplotype (Supplementary Tables 5-7). Interestingly, five of these 
SNPs in the region are private SNPs shared between Tibetans and the 
Denisovan, but not shared with any other population worldwide, except 
for two SNPs at low frequency in Han Chinese (Extended Data Fig. 4 
and Supplementary Table 7). 

If we consider all SNPs (not just the most differentiated) in the 32.7-kb 
region annotated in humans, to build a haplotype network using the 
40 most common haplotypes, we observe a clear pattern in which the 
Tibetan haplotype is much closer to the Denisovan haplotype than any 
modern human haplotype (Fig. 3 and Extended Data Fig. 5a; see Extended 
Data Fig. 6a, b for haplotype networks constructed using other criteria). 
Furthermore, we find that the Tibetan haplotype is slightly more diver- 
gent from other modern human populations than the Denisovan haplo- 
type is, a pattern expected under introgression (see Methods and Extended 
Data Fig. 5b). Raw sequence divergence for all sites and all haplotypes 
shows a similar pattern (Extended Data Fig. 7). Moreover, the divergence 
between the common Tibetan haplotype and Han haplotypes is larger 
than expected for comparisons among modern humans, but well within 
the distribution expected from human-Denisovan comparisons (Extended 
Data Fig. 8). Notably, sequence divergence between the Tibetans’ most 
common haplotype and Denisovan is significantly lower (P = 0.0028) 
than expected from human-Denisovan comparisons (Extended Data 
Fig. 8). We also find that the number of pairwise differences between 
the common Tibetan haplotype and the Denisovan haplotype (n = 12) 
is compatible with the levels one would expect from mutation accu- 
mulation since the introgression event (see Methods for Extended Data 
Fig. 8). Finally, if we compute D (ref. 14) and S* (refs 23, 24), two statistics 
that have been designed to detect archaic introgression into modern 
humans, we obtain significant values (D-statistic P < 0.001, and S* 
P = 0.035) for the 32.7-kb region using multiple null models of no gene 
flow (see Methods, Supplementary Tables 8-10, and Extended Data 
Figs 9 and 10a). 
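
For readers unfamiliar with the D statistic, the sketch below shows its basic count form, D = (ABBA − BABA)/(ABBA + BABA), for one sequence from each of three ingroup lineages (P1, P2, P3) and an outgroup. It illustrates the general idea only: the populations, filters and null models behind the P values reported above are described in the Methods and are not reproduced here. The toy genotypes and the function name are hypothetical.

```python
from typing import Sequence


def d_statistic(p1: Sequence[int], p2: Sequence[int],
                p3: Sequence[int], outgroup: Sequence[int]) -> float:
    """Return (ABBA - BABA) / (ABBA + BABA) over aligned biallelic sites.

    Alleles are coded 0/1. At each site the outgroup allele is taken as the
    ancestral state 'A' and the other allele as the derived state 'B'.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("all sequences must have the same length")
    abba = baba = 0
    for a1, a2, a3, ao in zip(p1, p2, p3, outgroup):
        if a3 == ao:
            continue  # P3 carries the ancestral allele: site is uninformative
        if a2 == a3 and a1 == ao:
            abba += 1  # derived allele shared by P2 and P3 only
        elif a1 == a3 and a2 == ao:
            baba += 1  # derived allele shared by P1 and P3 only
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Hypothetical toy data: six sites, outgroup fixed for the ancestral allele.
p1 = [0, 0, 0, 1, 0, 0]        # e.g. a haplotype from a reference population
p2 = [1, 1, 1, 0, 0, 1]        # e.g. a haplotype from the test population
p3 = [1, 1, 1, 1, 1, 0]        # e.g. an archaic haplotype
outgroup = [0, 0, 0, 0, 0, 0]
d_value = d_statistic(p1, p2, p3, outgroup)
print(f"D = {d_value:.2f}")
```

A positive value indicates that P2 shares more derived alleles with P3 than P1 does. Whether such an excess is significant has to be judged against a null model of no gene flow.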

Thus, we conclude that the haplotype associated with altitude adapta- 
tion in Tibetans is likely to be a product of introgression from Denisovan 
or Denisovan-related populations. The only other possible explanation 

Figure 3 | A haplotype network based on the number of pairwise differences 
between the 40 most common haplotypes. The haplotypes were defined from 
all the SNPs present in the combined 1000 Genomes and Tibetan samples: 
515 SNPs in total within the 32.7-kb EPAS1 region. The Denisovan haplotypes 
were added to the set of the common haplotypes. The R software package 
pegas was used to generate the figure, using pairwise differences as distances. 
Each pie chart represents one unique haplotype, labelled with Roman numerals, 
and the radius of the pie chart is proportional to the log,(number of 
chromosomes with that haplotype) plus a minimum size so that it is easier 

to see the Denisovan haplotype. The sections in the pie provide the breakdown 
of the haplotype representation amongst populations. The width of the edges 
is proportional to the number of pairwise differences between the joined 
haplotypes; the thinnest edge represents a difference of one mutation. The 
is ancestral lineage sorting. However, this explanation is very unlikely 
as it cannot explain the significant D and S* values and because it would 
require a long haplotype to be maintained without recombination since 
the time of divergence between Denisovans and humans (estimated to 
be at least 200,000 years (ref. 14)). The chance of maintaining a 32.7-kb 
fragment in both lineages throughout 200,000 years is conservatively 
estimated at P = 0.0012 assuming a constant recombination rate of 2.3 × 
10⁻⁸ per base pair (bp) per generation (see Methods). Furthermore, the 
haplotype would have to have been independently lost in all African and 
non-African populations, except for Tibetans and Han Chinese. 
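
The order of magnitude of this probability can be illustrated with a simple model. The sketch below is not the calculation from the Methods. It assumes that the length of an unbroken shared tract follows a gamma distribution with shape 2 and rate r × t, where r is the recombination rate per bp per generation and t is the total branch length in generations. Both the shape-2 gamma model and the branch length of 12,000 generations are assumptions made here for illustration.

```python
import math


def prob_unbroken_tract(length_bp: float, recomb_rate: float,
                        generations: float) -> float:
    """P(tract length >= length_bp) for a Gamma(shape=2, rate=r*t) tract.

    The survival function of a shape-2 gamma distribution with rate lam
    is (1 + lam * x) * exp(-lam * x).
    """
    if length_bp < 0 or recomb_rate <= 0 or generations <= 0:
        raise ValueError("inputs must be positive")
    x = recomb_rate * generations * length_bp
    return (1.0 + x) * math.exp(-x)


# r = 2.3e-8 per bp per generation and the 32.7-kb length are from the text.
# t = 12,000 generations is a hypothetical branch length, not from the text.
p_shared = prob_unbroken_tract(32_700, 2.3e-8, 12_000)
print(f"P(unbroken 32.7-kb tract) = {p_shared:.4f}")
```

With these assumed inputs the probability is on the order of 10⁻³, the same order as the estimate quoted above. Longer branch lengths or higher recombination rates make it smaller still.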

We have re-sequenced the EPAS1 region and found that Tibetans har- 
bour a highly differentiated haplotype that is only found at very low fre- 
quency in the Han population among all the 1000 Genomes populations, 
and is otherwise only observed in a previously sequenced Denisovan 
individual. As the haplotype is observed in a single individual in both 
CHS and CHB samples, it suggests that it was introduced into humans 
before the separation of Han and Tibetan populations, but subject to 
selection in Tibetans after the Tibetan plateau was colonized. Alterna- 
tively, recent admixture from Tibetans to Hans may have introduced 
the haplotype to nearby Han populations outside Tibet. The CHS and 
CHB individuals carrying the five-SNP Tibetan–Denisovan haplotype 
(Extended Data Fig. 3) show no evidence of being recent migrants from 
Tibet (see Methods and Extended Data Fig. 10b), suggesting that if the 
haplotype was carried from Tibet to China by migrants, this migration 
did not occur within the last few generations. 

Previous studies examining the genetic contributions of Denisovans 
to modern humans suggest that Melanesians have a much larger Denisovan 
component than either Han or Mongolians, even though the latter 
populations are geographically much closer to the Altai mountains. 
Interestingly, the putatively beneficial Denisovan EPAS1 haplotype is 
not observed in modern-day Melanesians or in the high-coverage Altai 
Neanderthal (Supplementary Table 4). Evidence has been found for 
legend shows all the possible haplotypes among these populations. The 
numbers (1, 9, 35 and 40) next to an edge (the line connecting two haplotypes) 
in the bottom right are the number of pairwise differences between the 
corresponding haplotypes. We added an edge afterwards between the Tibetan 
haplotype XXXIII and its closest non-Denisovan haplotype (XXI) to indicate its 
divergence from the other modern human groups. Extended Data Fig. 5a 
contains all the pairwise differences between the haplotypes presented in this 
figure. ASW, African Americans from the south western United States; CEU, 
Utah residents with northern and western European ancestry; GBR, British; 
FIN, Finnish; JPT, Japanese; LWK, Luhya; CHS, southern Han Chinese; CHB, 
Han Chinese from Beijing; MXL, Mexican; PUR, Puerto Rican; CLM, 
Colombian; TSI, Toscani; YRI, Yoruban. Where there is only one line within a 
pie chart, this indicates that only one population contains the haplotype. 
Denisovan admixture throughout southeast Asia (as well as in Melanesians) 
based on a global analysis of SNP array data from 1,600 individuals 
from a diverse set of populations, and this finding has been recently 
confirmed by ref. 26. Therefore, it appears that sufficient archaic admix- 
ture into populations near the Tibetan region occurred to explain the 
presence of this Denisovan haplotype outside Melanesia. Furthermore, 
the haplotype may have been maintained in some human populations, 
including Tibetans and their ancestors, through the action of natural 
selection. 

Recently, a few studies have supported the idea of adaptive introgres- 
sion from archaic humans to modern humans as having a role in the 
evolution of immunity-related genes (HLA (ref. 28) and STAT2 (ref. 29)) 
and in the evolution of skin pigmentation genes (BNC2 (refs 23, 30)). 
Our findings imply that one of the most clear-cut examples of human 
adaptation is likely to be due to a similar mechanism of gene flow from 
archaic hominins into modern humans. With our increased understand- 
ing that human evolution has involved a substantial amount of gene 
flow from various archaic species, we are now also starting to under- 
stand that adaptation to local environments may have been facilitated 
by gene flow from other hominins that may already have been adapted 
to those environments. 


METHODS SUMMARY 


DNA samples included in this work were extracted from peripheral blood of 41 
unrelated Tibetan individuals living at more than 4,300 m above sea level within 
the Himalayan Plateau, with informed consent. Tibetan identity was based on self- 
reported family ancestry. The individuals were from two villages of Dingri (4,300 m 
altitude) and Naqu (4,600 m altitude). These individuals are a subset of the 50 
individuals whose exomes were sequenced and analysed in ref. 4. Samples of 40 Han Chinese (CHB) are 
from the 1000 Genomes Project. A combined strategy of long-range PCR and next- 
generation sequencing was used to decipher the whole EPAS1 gene and its +30-kb 
flanking region. We designed 38 pairs of long-range PCR primers to amplify the 
region in 41 Tibetan and 40 Han individuals. PCR products from all individuals 
were fragmented and indexed, then sequenced to higher than 260-fold depth for 
each individual with the Illumina HiSeq 2000 sequencer. The reads were aligned to
the UCSC human reference genome (hg18) using the SOAPaligner. Genotypes of 
each individual at every genomic location of the EPAS1 gene and the flanking region 
were called by SOAPsnp. To make comparisons with the 40 Han easier, we only used 
40 Tibetan samples for this study. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Replication stress is a potent driver of functional
decline in ageing haematopoietic stem cells

Morgan E. Diolaiti, Ciaran G. Morrison & Emmanuelle Passegué


Haematopoietic stem cells (HSCs) self-renew for life, thereby making
them one of the few blood cells that truly age. Paradoxically,
although HSCs numerically expand with age, their functional activity
declines over time, resulting in degraded blood production and
impaired engraftment following transplantation. While many drivers
of HSC ageing have been proposed, the reason why HSC function
degrades with age remains unknown. Here we show that cycling old
HSCs in mice have heightened levels of replication stress associated
with cell cycle defects and chromosome gaps or breaks, which are due
to decreased expression of mini-chromosome maintenance (MCM)
helicase components and altered dynamics of DNA replication forks.
Nonetheless, old HSCs survive replication unless confronted with
a strong replication challenge, such as transplantation. Moreover,
once old HSCs re-establish quiescence, residual replication stress
on ribosomal DNA (rDNA) genes leads to the formation of
nucleolar-associated γH2AX signals, which persist owing to ineffective H2AX
dephosphorylation by mislocalized PP4c phosphatase rather than
ongoing DNA damage. Persistent nucleolar γH2AX also acts as a
histone modification marking the transcriptional silencing of rDNA
genes and decreased ribosome biogenesis in quiescent old HSCs. Our
results identify replication stress as a potent driver of functional
decline in old HSCs, and highlight the MCM DNA helicase as a
potential molecular target for rejuvenation therapies.

Both human and mouse HSCs accumulate γH2AX signals with age.
This is taken as direct evidence of DNA damage occurring in old HSCs,
since phosphorylation of histone H2AX by ATM or ATR upon sensing
of DNA breaks is one of the first steps in the canonical DNA damage
response (DDR). The idea that DNA damage is a driver of HSC ageing
is also supported by the age-related functional impairment observed in
HSCs isolated from mice deficient in DNA repair pathway components.
Accumulation of DNA damage in old HSCs is an attractive hypothesis to
explain the propensity of the ageing blood system to acquire mutations,
especially since quiescent HSCs are particularly vulnerable to genomic
instability after DNA damage, owing to their preferential use of the
error-prone non-homologous end joining (NHEJ) repair pathway. However,
it remains to be established what causes γH2AX accumulation with age,
and how it contributes to the functional decline of old HSCs.

To address these questions, we isolated HSCs as
Lin⁻/cKit⁺/Sca1⁺/Flk2⁻/CD48⁻/CD150⁺ cells from the bone marrow of young (6–12 weeks)
and old (22–30 months) wild-type C57BL/6 mice (Extended Data Fig. 1a).
We confirmed the functional impairment of old HSCs compared with
young HSCs, with the expected reduced engraftment, loss of lymphoid
potential and early onset of bone marrow failure or myeloid malignancies
following transplantation (Extended Data Fig. 1b). We also confirmed
that old HSCs contain more γH2AX signals than young HSCs
(Fig. 1a, b and Extended Data Fig. 2a). However, we found no evidence
of associated co-localization of DNA damage proteins by microscopy,
or DNA fragmentation by poly-ADP-ribose (PAR) and TdT-mediated
dUTP nick end labelling (TUNEL) staining (Fig. 1c, d and Extended
Data Fig. 2b, c). We also performed alkaline comet assays to directly
measure the number of DNA breaks and, although both populations showed
some very damaged outliers, no statistical difference in mean tail moment
was observed between young and old HSCs (Fig. 1e and Extended Data
Fig. 2d, e). Importantly, we tested the effect of 0.5 Gy of ionizing
radiation on young HSCs, since this dose was estimated to be equivalent to
the level of γH2AX signals present in old HSCs, and observed increased
tail moment by comet assay and 53BP1/γH2AX co-localization, hence
validating the sensitivity of our assays (Extended Data Fig. 2f, g). We also
found that age-associated γH2AX signals were considerably less intense
than ionizing-radiation-induced γH2AX foci (Extended Data Fig. 3a),
which probably reflects differences in the spread and density of
phosphorylated H2AX in each case. Collectively, these results indicate that
old HSCs display γH2AX signals without DDR activation or
detectable levels of DNA breaks.

a, b, Representative images (a) and quantification
(b) of γH2AX foci in young and old HSCs (yHSC
and oHSC, respectively). c, d, Representative
foci and p-ATR activation; d, PAR detection and

To determine whether old HSCs remain competent for DDR, we
exposed young and old HSCs to 2 Gy of ionizing radiation and followed
their kinetics of DNA repair by microscopy (Fig. 2a and Extended Data
Fig. 3b). In both populations, we observed increased 53BP1-containing
γH2AX foci by 2 h after ionizing radiation, followed by their progressive
disappearance over time. Although old HSCs showed slower kinetics,
both populations had essentially cleared all ionizing-radiation-induced
γH2AX foci by 24 h after irradiation (Fig. 2b). In addition, both young
and old HSCs expressed equivalent levels of homologous recombination (HR)
and NHEJ DNA repair genes by quantitative polymerase chain reaction
with reverse transcription (qRT-PCR) analyses (Fig. 2c). Altogether,
these results demonstrate that old HSCs can activate the DDR and clear
ionizing-radiation-induced γH2AX foci as effectively as young HSCs,
and suggest that accumulation of γH2AX in old HSCs could be
independent of the sensing of DNA breaks. In fact, ATR can also be activated
upon sensing interference with DNA replication forks. Strikingly, we
observed increased staining for the single-stranded DNA-binding
proteins RPA and ATRIP in old HSCs (Fig. 2d, e and Extended Data Fig. 3c),
which suggests that age-associated γH2AX signals could originate from
replication stress.

Figure 2 | Efficient DNA repair but persistence of replication stress
remnants in old HSCs. a, b, Representative images (a) and quantification
(b) of DNA repair kinetics in 2 Gy irradiated young and old HSCs (yHSC
and oHSC, respectively; n = 3). IR, ionizing radiation. c, qRT-PCR analyses
of HR and NHEJ gene expression in young and old HSCs (n = 4). Results
are expressed as fold change compared with young HSCs (set to 1).
d, e, Representative images and mean fluorescence intensity (MFI)
quantification of RPA (d) and ATRIP (e) staining in young and old HSCs.
2 Gy irradiated cells are included as positive control. Scale bars, 10 µm.
Data are means ± s.d. ***P ≤ 0.001.

Replication stress is intrinsically linked to cell proliferation, and
previous studies have reported a spectrum of findings ranging from increased
to decreased to unchanged proliferation in old HSCs. In our hands, cell
cycle analyses revealed a variable frequency of G0/G1 cells in old HSCs
(Fig. 3a and Extended Data Fig. 4a). RT-PCR analyses of cell cycle genes
also indicated enhanced expression of Cdkn1a (p21) and decreased
expression of a range of cyclins in old HSCs, which suggests engagement of cell
cycle restriction checkpoints (Extended Data Fig. 4b). Moreover,
tracking of single-cell division kinetics uncovered a consistent ~4 h delay in
the onset of the first division in old HSCs, which increased further
during the second division (Fig. 3b). We directly confirmed that old HSCs
had both a delayed entry into S phase and an extended S phase using single
5-ethynyl-2′-deoxyuridine (EdU) and double EdU/5-bromodeoxyuridine
(BrdU) incorporation experiments (Fig. 3c and Extended Data Fig. 4c).
Collectively, these results demonstrate impaired progression through S
phase in cycling old HSCs.

Figure 3 | Replication stress in cycling old HSCs. a, Cell cycle distribution of
young and old HSCs (yHSC and oHSC, respectively; n = 8). b, Single cell
tracking to measure the kinetics of the first and second cell division in cultured
young and old HSCs (n = 3). c, EdU and EdU/BrdU labelling of cycling young
and old HSCs (n = 4). d, e, Representative images of γH2AX/p-CHK1 (d)
and γH2AX/53BP1 (e) foci in cycling young and old HSCs. f, Representative
images of γH2AX/EdU staining in 36 h cycling young and old HSCs.
g, h, Representative images of RPA staining (g) and persistent G1-phase 53BP1
bodies (h) in 36 h cycling young and old HSCs. i, Quantification of mean
tail moment in 36 h cycling young and old HSCs by alkaline comet assay
(n = 4). Results are expressed as fold change compared with yHSCs (set to 1).
j, Representative reverse image of DAPI-stained metaphase cell from
5-day-expanded old HSCs showing chromatid gaps (arrows). Scale bars, 10 µm. Data
are means ± s.d. *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001. NS, not significant.

DNA replication is often accompanied by γH2AX foci formation at
stalled and/or collapsed replication forks, and activation of the DDR to
allow normal DNA synthesis. Strikingly, we found that cycling old HSCs
displayed significantly more phosphorylated (p)-CHK1 and
53BP1-containing γH2AX foci than young HSCs, and directly showed γH2AX
accumulation in replicating old HSCs using EdU/γH2AX co-staining
of both in vitro and in vivo cycling cells (Fig. 3d–f and Extended Data
Fig. 4c, d). We also confirmed elevated RPA staining in 36 h cycling old
HSCs (Fig. 3g), and persistent 53BP1 bodies in ~60% of old HSCs that
had re-entered G1 phase at this time point compared with ~20% of young
HSCs (Fig. 3h and Extended Data Fig. 4e). Moreover, we observed a
trend towards elevated amounts of fragmented DNA detected by alkaline
comet assays in cycling old HSCs either cultured in vitro (not significant)
or isolated after in vivo mobilization treatment (Fig. 3i and Extended
Data Fig. 4f, g). Consistently, 5-day-expanded cultures showed increased
numbers of chromosomal gaps and breaks in the progeny of cycling
old HSCs (Fig. 3j and Supplementary Table 1). In contrast,
karyotyping analyses revealed no evidence of chromosomal deletions and/or
translocations (Supplementary Table 1), which usually occur as a
consequence of NHEJ-mediated repair of DNA breaks. Collectively, these
results demonstrate heightened levels of replication stress in cycling
old HSCs associated with extended S phase and acquisition of
chromosomal gaps/breaks. In rare cases, we also observed exacerbated features
of replication stress in old HSCs, including senescence with increased
senescence-associated β-galactosidase (SA-β-Gal) staining and Cdkn2a
(p16) expression, and fragile telomeres with multiple telomeric signals
(Extended Data Fig. 5a, b).

To understand at the molecular level why old HSCs have replication
stress, we performed microarray gene expression analyses. We compared
both HSCs and granulocyte/macrophage progenitors (GMPs) and
subtracted genes that were differentially expressed between young and
old GMPs. This allowed us to identify 913 significantly differentially
expressed genes that were specific to old HSCs and segregated into four
main clusters (Supplementary Table 2). Among those, we observed a
selective downregulation of all MCM genes (Mcm2–7), which encode
the six subunits of the MCM DNA helicase (Extended Data Fig. 5c, d).
Using qRT-PCR, we confirmed unchanged levels of Mcm genes in old
GMPs, and significantly decreased expression of at least Mcm4 and Mcm6
in both quiescent and cycling old HSCs (Extended Data Figs 5e and 6a, b).
Moreover, we directly demonstrated a ~50% decrease in MCM4 and
MCM6 protein levels in quiescent old HSCs (Fig. 4a, b). Interestingly,
re-analyses of published data sets also showed decreased Mcm
expression in several old HSC samples (Extended Data Fig. 6c). The MCM
proteins form a heterohexameric complex that is part of the pre-replication
complex assembled at origins of replication during late M/early G1 phases.
At the G1-to-S-phase transition, MCM proteins associate with CDC45
and the go-ichi-ni-san (GINS) complex to form an active helicase that
unwinds the DNA at replication forks. In contrast to MCM proteins,
the expression of other pre-replication complex or DNA helicase
components was not altered in old HSCs (Extended Data Fig. 6a). Collectively,
these results uncover a specific deficit in MCM proteins in old HSCs.
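
The subtraction step described above amounts to a set difference between two lists of differentially expressed genes. The following is only a minimal sketch under stated assumptions: the gene lists are hypothetical placeholders rather than data from this study, and the function name `hsc_specific_genes` is illustrative and not part of the original analysis.

```python
def hsc_specific_genes(hsc_diff, gmp_diff):
    """Return genes differentially expressed between young and old HSCs
    that are NOT also differentially expressed between young and old GMPs."""
    return set(hsc_diff) - set(gmp_diff)


# Hypothetical example lists (illustrative only, not data from the study).
# "GeneA" and "GeneB" are placeholder names.
hsc_diff = {"Mcm4", "Mcm6", "GeneA", "GeneB"}
gmp_diff = {"GeneB"}  # changes with age in GMPs too, so it is removed

specific = hsc_specific_genes(hsc_diff, gmp_diff)
print(sorted(specific))
```

A plain set difference treats differential expression as a yes/no call per gene; an actual microarray analysis would also have to choose significance and fold-change thresholds, which this sketch leaves out.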

One characteristic of cells with decreased MCM levels is
hypersensitivity to replication stressors. To test the sensitivity of young and old
HSCs, we used a low dose of the DNA polymerase inhibitor aphidicolin.
While 36 h in vitro aphidicolin treatment only had a modest effect on
young HSCs, old HSCs displayed a massive accumulation of γH2AX
foci, enhanced apoptosis and severely impaired colony-forming activity
upon re-plating in methylcellulose (Fig. 4c–e). However, after
transplantation, 36 h aphidicolin-treated young HSCs showed strikingly impaired
reconstitution ability, with early onset of bone marrow failure and death,
while both treated and untreated old HSCs displayed equally poor
transplantability with reduced lymphoid output (Extended Data Fig. 7a).
Moreover, the number of engrafted 36 h aphidicolin-treated young HSCs was
significantly reduced to levels similar to engrafted old HSCs, either treated
or untreated (Extended Data Fig. 7b). These findings demonstrate that
induced replication stress severely damages the functionality of young
HSCs in a way that resembles age-associated effects, and that
transplantation is the ultimate replication challenge for old HSCs. We also
confirmed that the differential killing of old HSCs was specific for replication
stressors as opposed to non-specific cytotoxic agents (Extended Data
Fig. 7c). Collectively, these results demonstrate that replication stress can
degrade HSC function, even in young HSCs with a full complement of
MCM proteins, and that old HSCs with reduced MCM levels are more
susceptible to the killing effect of replication challenges both in vitro and
in vivo.

Figure 4 | Defective replication due to reduced MCM expression in old
HSCs. a, b, Representative images and quantification (MFI) of MCM4 (a) and
MCM6 (b) protein levels in young and old HSCs (yHSC and oHSC,
respectively). c–e, Effect of low-dose aphidicolin (Aph.; 50 ng ml⁻¹) on
cultured young and old HSCs (n = 3): c, representative images of γH2AX foci;
d, cleaved caspase 3 (CC3) levels; and e, colony counts in methylcellulose after
36 h treatment. Results are normalized for vehicle-treated cells (Veh.) and
expressed as fold change compared with young HSCs (set to 1). f–h, Analyses
of CldU/IdU-labelled DNA replication tracks in 36 h cycling young and old
HSCs (n = 3): f, representative images (arrows indicate fork progression);
g, individual fork velocities with means (bars); and h, box plot quantification of
fork symmetry ratio. i–k, Lentiviral-mediated knockdown of Mcm4 and Mcm6
in young HSCs (n = 3). Transduced green fluorescent protein (GFP)⁺ HSCs
were re-isolated 48 h post-infection for in vitro analyses, or used without
re-isolation 12 h post-infection for transplantation (5 mice per condition):
i, representative images of MCM4 and MCM6 protein levels and γH2AX foci;
j, colony counts in methylcellulose (results are expressed as fold change
compared to scrambled shRNA (sh-scr)-infected HSCs, set to 100%); and
k, reconstitution ability upon transplantation (results are percentage of GFP
chimaerism normalized to the initial transduction efficiency per construct).
UT, untransfected. Scale bars, 10 µm (a–c, i); 2.5 µm (f). Data are means ± s.d.
*P ≤ 0.05, **P < 0.01, ***P < 0.001.

Although MCM proteins are normally present in excess,
downregulation of just one component is sufficient to sensitize cells to replication
stress by reducing their capacity to activate dormant origins in response
to collapsed replication forks. To directly assess replication at the
single-molecule level, we analysed stretched DNA fibres after
5-chloro-2′-deoxyuridine (CldU)/5-iodo-2′-deoxyuridine (IdU) labelling of 36 h
replicating young and old HSCs (Fig. 4f and Extended Data Fig. 7d).
Strikingly, we observed a significant increase in both replication fork
velocity and numbers of asymmetric replication forks in old HSCs
(Fig. 4g, h). These altered dynamics are consistent with the replication
stress features reported in cells with impaired MCM activity, and
probably reflect a reduced number of licensed replication origins
leading to an extended S phase in old HSCs. We then used lentiviral vectors
containing either Mcm4 or Mcm6 short hairpin RNA (shRNA) to
determine the effect of decreased MCM levels on young HSC function. We
confirmed =50% knockdown both at the messenger RNA and protein
levels, associated with accumulation of γH2AX foci, lower colony-forming
ability and reduced expansion rates in culture (Fig. 4i, j and Extended
Data Figs 7e, f, 8a), hence demonstrating impaired replication in
transduced young HSCs. Moreover, transplantation experiments showed
decreased reconstitution ability from transduced HSCs (Fig. 4k), which
directly confirms that reducing MCM levels strongly impairs young
HSC functionality. Collectively, these results reinforce previous data
linking decreased MCM levels to impaired stem- and progenitor-cell
proliferation, and identify the deficit in MCM DNA helicase
components as the likely cause of the replication stress features of old HSCs.

Figure 5 | Persistent nucleolar γH2AX foci in quiescent old HSCs.
a, Representative images of γH2AX and nucleolar marker co-localization in
young and old HSCs (yHSC and oHSC, respectively): fibrillarin (FBL);
upstream binding factor (UBF); and nucleolin (NCL). b, Quantification of
γH2AX/FBL foci in young and old cells. ND, not detectable. c, Representative
kinetics of nucleolar dissociation/reformation in cultured young and old HSCs
(n = 3). d, Representative images of FBL/γH2AX staining in cultured old HSCs.
e, Representative images of immuno-FISH for γH2AX and rDNA in young
and old HSCs. f, RT-PCR analyses of 47S rRNA precursor transcript
expression in quiescent (n = 12) and cycling (n = 8) young and old HSCs.
Results are expressed as log₂ fold change compared with young HSCs (set to 0).
Bars indicate average expression levels. g, Representative images of PP4c
staining in cultured young and old HSCs. Scale bars, 10 µm. Data are
means ± s.d. *P ≤ 0.05, ***P ≤ 0.001. NS, not significant.

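
The two quantities read out from the CldU/IdU-labelled DNA fibres (Fig. 4f–h), fork velocity and fork symmetry, can be illustrated with a short calculation. This is a sketch under stated assumptions, not the procedure used in the study: the conversion factor of 2.59 kb per µm of stretched fibre is a value commonly used in fibre assays and is assumed here, the symmetry ratio is defined as the longer over the shorter of two sister-fork tracks, and the function names and example numbers are hypothetical.

```python
def fork_velocity_kb_per_min(track_length_um, pulse_min, kb_per_um=2.59):
    """Convert a labelled track length into a fork velocity.

    track_length_um: length of the labelled track on the stretched fibre (µm)
    pulse_min:       duration of the labelling pulse (minutes)
    kb_per_um:       assumed stretching factor (kb of DNA per µm of fibre)
    """
    if pulse_min <= 0:
        raise ValueError("pulse duration must be positive")
    return track_length_um * kb_per_um / pulse_min


def fork_symmetry_ratio(left_um, right_um):
    """Ratio of the longer to the shorter sister-fork track (1.0 = symmetric).

    Assumed definition: both forks emanate from the same origin, and a ratio
    well above 1 indicates that one fork stalled or slowed.
    """
    shorter, longer = sorted((left_um, right_um))
    if shorter <= 0:
        raise ValueError("track lengths must be positive")
    return longer / shorter


# Hypothetical measurements (illustrative only, not data from the study).
print(fork_velocity_kb_per_min(track_length_um=12.0, pulse_min=20.0))
print(fork_symmetry_ratio(left_um=9.0, right_um=6.0))
```

Under this definition, a population with more asymmetric forks shows a symmetry-ratio distribution shifted away from 1, which is the kind of shift a box plot of fork symmetry ratios would display.
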
While replication stress provides an explanation for the high levels
of γH2AX in cycling old HSCs, it does not account for γH2AX
accumulation in quiescent old HSCs. To address this issue, we asked whether
γH2AX could mark particular subnuclear structures. Whereas no
co-localization was found with centromeric or telomeric regions, we observed
an almost complete co-localization of γH2AX signals with nucleolar
markers in quiescent old HSCs (Fig. 5a and Extended Data Fig. 8b, c).
Although nucleolar γH2AX signals were almost never found in young
cells, they were occasionally observed in old multipotent progenitors
(MPPs), but not in old GMPs or granulocytes (Fig. 5b and Extended
Data Fig. 8d, e). One to five nucleoli can usually be observed per mouse
cell, which result from the cell-cycle-dependent assembly of nucleolar
organizer regions (NORs) present on four different chromosome pairs
in the C57BL/6 mouse background. As expected, replicating HSCs
showed a cyclical dissociation/reformation of nucleolar structures, albeit
with slower kinetics in old HSCs (Fig. 5c and Extended Data Fig. 8f).
Remarkably, γH2AX signals quickly vanished from the nucleolus in
cycling old HSCs even before nucleolar dissociation, and never re-appeared
in vitro even after nucleolar reformation (Fig. 5d and Extended Data
Fig. 8f). Nucleolar γH2AX signals were also never observed in cycling
old HSCs re-isolated 2 weeks after transplantation, but were readily
detectable in old HSCs re-isolated 7 months after transplantation, which had,
by then, re-entered quiescence (Extended Data Fig. 9a). The nucleolus
is primarily the site of ribosome biogenesis, where multiple repeats of
rDNA genes present on each NOR are transcribed and then spliced to
produce the 18S, 5.8S and 28S rRNA subunits. We confirmed the
presence of γH2AX on rDNA genes in quiescent old HSCs using an
immuno-fluorescence in situ hybridization (FISH) approach (Fig. 5e and Extended
Data Fig. 8g). We also found significantly reduced expression of the 47S
rRNA precursor transcripts by RT-PCR in quiescent old HSCs (Fig. 5f),
and confirmed decreased ribosome biogenesis in these cells using
Bioanalyzer track analyses (Extended Data Fig. 9b). Moreover, we observed
restoration of 47S rRNA precursor transcript expression to levels found
in young HSCs in cycling old HSCs that had lost nucleolar γH2AX
(Fig. 5f). Taken together, these results indicate that nucleolar γH2AX
signals are an exclusive feature of quiescent old HSCs, which correlate
with decreased ribosome biogenesis and could mark the transcriptional
silencing of rDNA genes. In contrast, none of the classic histone
methylation marks associated with active or repressed transcription displayed
specific nucleolar enrichment in old HSCs (Extended Data Fig. 9c).

rDNA genes are the most abundant and highly transcribed genes in
eukaryotes, which contain many replication origins and are known to
challenge the replication machinery. It is therefore likely that
replicating old HSCs accumulate γH2AX on rDNA genes, and we propose that
their aggregation during nucleolar reformation causes the appearance
of nucleolar-associated γH2AX signals in quiescent old HSCs.
Importantly, single-nucleotide polymorphism (SNP) analyses of amplified
genomic DNA did not reveal significant differences in rDNA sequences
between young and old HSCs (data not shown), which suggests that
replication stress has little to no mutagenic consequence for rDNA gene
integrity. In addition, we found that PP4c, one of the best-characterized
γH2AX phosphatases, was strikingly mislocalized in quiescent old
HSCs. While nuclear PP4c was observed in both quiescent and cycling
young HSCs, PP4c was found almost exclusively in the cytoplasm of
quiescent old HSCs and only became nuclear when old HSCs re-entered
the cell cycle (Fig. 5g and Extended Data Fig. 9d). Nuclear re-localization
of PP4c also occurred within the same time window (3–9 h) as
disappearance of γH2AX from the nucleolus in cycling old HSCs. Thus, it is
conceivable that the long-term persistence of nucleolar γH2AX in
quiescent old HSCs results from ineffective H2AX dephosphorylation
due to mislocalized PP4c rather than ongoing DNA damage. Although
we observed the presence of unrepaired, RPA-coated stretches of
single-stranded DNA in quiescent old HSCs, they do not appear to trigger the
DDR in these cells, in contrast to what has been described in other
contexts. They might also be the source of the DNA breaks detected by
alkaline comet assays in a recent study of quiescent old HSCs, but again
without evidence of an activated DDR. A failure to dephosphorylate
H2AX could therefore explain why quiescent old HSCs show persistent
γH2AX signals without DDR activation.

Our results demonstrate that replication stress is a potent driver of
functional decline in old HSCs, and identify a deficit in MCM helicase
components as the molecular mechanism for the impaired replication
of old HSCs (Extended Data Fig. 10). It will now be important to
understand why expression of Mcm genes decreases with age in HSCs, and
whether this could be reversed through direct changes in old HSCs or
rejuvenation of the ageing bone marrow niche. Our results also
suggest a non-canonical function for γH2AX as an epigenetic histone
modification that marks the silencing of the transcription machinery. This
could be a normal mechanism to block transcription in genomic regions
undergoing DNA repair, and further studies will address the broad
relevance of this novel finding. It will also be interesting to determine whether
decreased rDNA gene transcription in quiescent old HSCs plays a part
in bone marrow failure syndromes and other age-related blood defects
linked to defective ribosome function. In this context, it will be
important to understand why PP4c is mislocalized in quiescent old HSCs, and
whether this can be reversed for therapeutic purposes.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Historical contingency and its biophysical basis in
glucocorticoid receptor evolution

Understanding how chance historical events shape evolutionary
processes is a central goal of evolutionary biology. Direct insights into
the extent and causes of evolutionary contingency have been limited
to experimental systems, because it is difficult to know what
happened in the deep past and to characterize other paths that evolution
could have followed. Here we combine ancestral protein
reconstruction, directed evolution and biophysical analysis to explore
alternative ‘might-have-been’ trajectories during the ancient evolution of a
novel protein function. We previously found that the evolution of
cortisol specificity in the ancestral glucocorticoid receptor (GR) was
contingent on permissive substitutions, which had no apparent effect
on receptor function but were necessary for GR to tolerate the
large-effect mutations that caused the shift in specificity. Here we show that
alternative mutations that could have permitted the historical
function-switching substitutions are extremely rare in the ensemble of genotypes
accessible to the ancestral GR. In a library of thousands of variants
of the ancestral protein, we recovered historical permissive
substitutions but no alternative permissive genotypes. Using biophysical
analysis, we found that permissive mutations must satisfy at least
three physical requirements: they must stabilize specific local
elements of the protein structure, maintain the correct energetic
balance between functional conformations, and be compatible with the
ancestral and derived structures. This reveals why permissive
mutations are rare. These findings demonstrate that GR evolution
depended strongly on improbable, non-deterministic events, and this
contingency arose from intrinsic biophysical properties of the protein.

Historians and evolutionary biologists have long wrestled with the idea 
that historical outcomes may hinge on chance events. How differently 
would the world have turned out if the Persian cavalry had been present 
at the Battle of Marathon or if the KT asteroid had missed the Earth? In 
biology, evolutionary trajectories driven solely by the deterministic force 
of natural selection will always produce the optimal accessible form, ir- 
respective of chance events*”®. In contrast, when non-deterministic pro- 
cesses such as drift play a strong part, the outcome depends on whatever 
chance events occur during evolution; if history could be set in motion 
again from some past starting point, very different results would prob- 
ably unfold. 

Recent studies show that the evolution of some protein functions was
contingent on prior ‘permissive’ mutations, which are functionally
neutral in isolation but must be present for the function-altering mutations
to be tolerated. Permissive mutations cannot be fixed by selection
for the derived function and must therefore accumulate stochastically
with respect to it. It remains unknown, however, how many permissive
mutations could have enabled these evolutionary transitions and
therefore whether the dependence on non-deterministic events is strong or
weak. If the suite of potential permissive mutations is large, then many
different evolutionary paths could enable the function-switching
mutations, and the outcome of protein evolution would be only weakly
contingent on its specific history. Conversely, if only a few mutations have
the potential to permit the realized outcome, the probability that one of
these would occur by chance would be very small, and the particular
form and function achieved by the evolving protein would be strongly
contingent on a low-probability event.
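
The dependence of contingency on the number of permissive mutations can be made concrete with a toy calculation. The sketch below is an illustration under strong simplifying assumptions that are not taken from the paper: every possible single mutation is equally likely, mutations are drawn without replacement, and the counts used are hypothetical. The function name `prob_at_least_one_permissive` is likewise illustrative.

```python
from math import comb


def prob_at_least_one_permissive(n_possible, n_permissive, n_sampled):
    """Probability that at least one permissive mutation is among
    n_sampled distinct mutations drawn uniformly from n_possible,
    of which n_permissive are permissive (hypergeometric model)."""
    if not 0 <= n_permissive <= n_possible:
        raise ValueError("n_permissive must lie between 0 and n_possible")
    if not 0 <= n_sampled <= n_possible:
        raise ValueError("n_sampled must lie between 0 and n_possible")
    # Probability that every sampled mutation is non-permissive.
    p_none = comb(n_possible - n_permissive, n_sampled) / comb(n_possible, n_sampled)
    return 1.0 - p_none


# Hypothetical numbers (illustrative only): the same mutational sample,
# with either few or many permissive options among the possibilities.
few = prob_at_least_one_permissive(n_possible=1000, n_permissive=2, n_sampled=10)
many = prob_at_least_one_permissive(n_possible=1000, n_permissive=200, n_sampled=10)
print(few, many)
```

In this toy model, a small permissive set makes the realized outcome a low-probability event, whereas a large permissive set makes it close to certain, which mirrors the contrast between strong and weak contingency drawn in the text.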

Understanding evolutionary contingency requires measuring the num- 
ber of potentially permissive mutations and characterizing the factors 
that determine that number. Because history happened only once, this 
knowledge has been inaccessible for natural biological systems that evolved 
in the deep past. We addressed this issue by reconstructing ancestral 
proteins and subjecting them to directed evolution, a protein engineer- 
ing strategy to efficiently characterize regions of protein sequence space 
with respect to some function of interest. We then employed
biophysical analyses to explore the mechanistic factors that determined the
number of permissive genotypes. 

We previously characterized an evolutionary transition in the GR 
ligand-binding domain (LBD) of bony vertebrates and found that it was 
contingent on permissive mutations®. The LBD serves as an allosteri- 
cally regulated transcriptional activator: hormone binding causes the 
‘activation-function helix’ (AF-H) to pack against the body of the protein, 
creating a new surface to which coactivator proteins bind, and increas- 
ing the transcription of nearby target genes'*””. Using ancestral protein 
reconstruction, we previously found that the cortisol-specific GR evolved 
from a promiscuous ancient receptor (AncGR1) because of seven his- 
torical substitutions that are conserved in all extant GRs (Fig. 1a, b)°. Of 
these, five function-switching substitutions (denoted F) eliminated the 
response to other hormones by repositioning a helix (H7) along one side 
of the binding cavity and establishing new cortisol-specific contacts. In- 
troducing the F substitutions into AncGR1, however, rendered the pro- 
tein non-functional (Fig. 1b). The remaining two historical substitutions 
(P) were permissive: they had no detectable effect on receptor function 
when introduced into AncGR1, but they allowed F to be tolerated, 
yielding a cortisol-specific receptor (Fig. 1b). Contingency is apparent, 
because selection for cortisol specificity could not deterministically drive 
the acquisition of P, which was required for the subsequent evolution 
of F and the domain’s derived structure and function. It is unlikely that 
the evolving GR passed through a non-functional intermediate con- 
taining F without P*°, because the LBD remained conserved, presum- 
ably as a result of functional constraints, for ~40 million years from the 
gene duplication event that generated it until the evolution of its new 
function (see ref. 21). 

To understand the strength of contingency, we used directed evolu- 
tion to estimate the frequency of alternative permissive mutations (P’) 
in a large library of ancestral protein variants. Permissive mutations must 
fulfil two criteria (Fig. 1c): they must rescue the non-functional AncGR1 
+F protein, allowing it to tolerate the F mutations, and they must be 
compatible with the ancestral sequence and function when introduced 
into AncGR1. To screen for rescuing mutations that meet the first cri- 
terion, we generated a large library of random mutants of AncGR1+F 
and characterized the resulting distribution of amino acid replacements 
(Extended Data Figs 1 and 2). We screened this library with a yeast two- 
hybrid system that linked growth to the cortisol-dependent interaction 
of the LBD with its coactivator peptide**”*. We applied a liberal stand- 
ard of growth to capture all rescuing mutations and verified their effects 

Figure 1 | Searching for alternative permissive mutations in an ancestor of 
GR. a, Evolution of hormone specificity in vertebrate GRs’. Icons indicate taxa 
(tetrapods, teleosts, elasmobranchs); circles show sensitivity to cortisol (purple) 
or 11-deoxycorticosterone (orange). The transparent box represents the 
evolution of new function. b, Seven historical substitutions recapitulate the shift 
in specificity. Two permissive mutations (P), which have no effect on specificity 
when introduced alone, allow AncGR1 to tolerate five function-switching 
mutations (F)°. Spheres are coloured by primary ligand (orange, 11- 
deoxycorticosterone; purple, cortisol), or no activation (grey). Thick bars 
connect functional proteins; thin bars lead to non-functional proteins. Arrows 
represent evolutionary paths that pass only through functional intermediates. 
c, Historical (P) or alternative permissive (P’) mutations rescue AncGR1+F 
and are tolerated in the ancestral background. Non-permissive pathways pass 
through non-functional intermediates (A and B, grey spheres) or fail to rescue F 
(C). Inset: screening conditions in yeast that identify AncGR1+F variants that 
confer growth in 1 µM cortisol, compared with vehicle-only control. 


in both naive yeast and a mammalian reporter assay (Fig. 1c and Ex- 
tended Data Fig. 3). We screened ~12,500 clones, comprising an esti- 
mated 1,025 unique single replacements (71% of all accessible neighbours), 
1,802 unique double replacements and 825 higher-order combinations 
(3,650 total; see Methods and Extended Data Figs 1 and 2); the remain- 
der were duplicate clones or contained nonsense, frameshift or zero non- 
synonymous mutations. We found no evidence of bias in the library 
(Extended Data Fig. 2 and Methods). 
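
The library composition reported above can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts quoted in the text; the variable names are illustrative, and the number of accessible single neighbours is back-calculated from the stated 71% coverage rather than taken from the paper.

```python
# Counts quoted in the text for the screened AncGR1+F library.
single_replacements = 1025
double_replacements = 1802
higher_order = 825
screened_clones = 12500  # approximate figure given in the text

# Unique informative variants; the text rounds this to ~3,650.
unique_variants = single_replacements + double_replacements + higher_order

# 71% coverage of single neighbours implies roughly this many accessible
# single replacements (a back-calculation, not a number from the paper).
accessible_singles = round(single_replacements / 0.71)

# Remainder: duplicates, nonsense, frameshift or synonymous-only clones.
uninformative_clones = screened_clones - unique_variants

print(unique_variants, accessible_singles, uninformative_clones)
```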

This screen identified 12 unique clones that improved AncGR1+F’s 
sensitivity to cortisol. These clones carried one, two or three mutations 
each, but dissection of the combinations showed that functional effects 
were due entirely to single mutations that co-occurred with neutral 
changes (Extended Data Fig. 4). In total, we found ten unique single mu- 
tations that completely or partly rescued cortisol sensitivity. Two of 
these involved historically substituted residues: one was a historical P 
substitution (n26T, with upper and lower cases denoting derived and 
ancestral states), and the other reverted one F substitution to its ances- 
tral state (I98f), conferring partial growth in the absence of permissive 
mutations (Extended Data Fig. 3). Of the novel rescuing mutations, 
three (M222I, M222L and L231M) improved the cortisol sensitivity of 
AncGR1+F tenfold or more, an effect as great as historical P (Fig. 2a). 
The remaining five mutations improved cortisol sensitivity twofold to 
threefold each, comparable to the individual members of P, but much 
less than the pair together (Fig. 2a). To see whether pairing any of the 
small-effect substitutions could recapitulate the effect of P, we gener- 
ated all twofold combinations of the weak rescuing mutations. Only 
one pair (Q114L/M197I) affected cortisol sensitivity similarly to the 
historical set P (Fig. 2a). The screen therefore recovered four alterna- 
tive rescuing combinations—one double and three single mutants— 
indicating that rescuing mutations are rare, on the order of 4 in 3,650, 
or ~0.1%. 
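
The pairing step and the resulting frequency estimate can be sketched as follows. Only Q114L and M197I are named in the text; the other three weak rescuing mutations are represented by placeholder labels, which are hypothetical. Five weak mutations give ten possible pairs, which appears to match the ten engineered double mutants mentioned later in the text.

```python
from itertools import combinations

# Five small-effect rescuing mutations; only the first two are named in the
# text, the remaining labels are placeholders for illustration.
weak_rescuers = ["Q114L", "M197I", "weak_3", "weak_4", "weak_5"]

# All twofold combinations of the weak rescuing mutations.
pairs = list(combinations(weak_rescuers, 2))
n_pairs = len(pairs)

# Three large-effect singles plus one rescuing pair, out of ~3,650 variants.
rescuing_sets = 3 + 1
library_size = 3650
rescue_frequency = rescuing_sets / library_size

print(f"{n_pairs} pairs tested; rescuing frequency ~ {rescue_frequency:.2%}")
```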

To determine whether the rescuing mutations discovered in the screen 
met the second criterion for permissive mutations—functional com- 
patibility with the ancestral genetic background—we introduced them 
into AncGR1 and characterized their effects on hormone-dependent 
activation. Unlike the historical permissive mutations, all four rescuing 
mutations disrupted the ancestral protein’s ligand-regulated transcrip- 
tional function. The large-effect rescuing mutations each caused transcrip- 
tional activation even in the absence of hormone, and caused promiscuous 

Figure 2 | Rescuing mutations disrupt the ancestral protein’s function. 
a, Effects of rescuing mutations on cortisol sensitivity in AncGR1+F. 
Sensitivity is defined as the ratio of the mutant to the AncGR1+F 
concentrations giving half-maximal response (EC50) in a luciferase reporter 
assay. Results are shown as means and s.e.m. for the number of experimental 
replicates indicated by grey circles. Green, historical P substitutions, with effect 
shown by dotted line; rescuing mutations from the screen are coloured by 
their structural location (see Fig. 3c). b, Rescuing AF-H mutations disrupt 
AncGR1 regulation. Fold reporter activation with progesterone over vector- 
only control is shown for AncGR1 (grey), historical P (green) and 3 AF-H 
mutations (pink shades, corresponding to inset graph). Results are shown as 
means and s.e.m. for three technical replicates. Inset, fold activation for mutants 
with no hormone (vehicle only). c, Q114L/M197I abolishes activation. The 
fold activation in 1 µM 11-deoxycorticosterone (11-DOC) or cortisol versus 
vehicle is shown. Results are means ± s.e.m. for three technical replicates. 


activation in response to low doses of other steroids such as proges- 
terone (Fig. 2b), a natural hormone excluded by all known extant and 
ancestral corticosteroid receptors. The pair Q114L/M197I destroyed 
AncGR1’s transcriptional function entirely, making it unable to activ- 
ate reporter expression even at high hormone concentrations (Fig. 2c). 

Permissive mutations are therefore extremely rare. Among ~3,660 
unique protein variants (~3,650 in the screened library plus 10 engi- 
neered double mutants), zero permissive genotypes were present. One 
permissive combination, the historical set P, exists in the universe of 
sequences near AncGR1, so we estimate an upper bound frequency of 
accessible permissive pathways of less than 1 in 3,660 (0.03%). The total 
frequency is probably far lower, because knowledge of this one permis- 
sive pathway was not acquired by sampling. Further, our screen of double 
mutants was biased towards the discovery of rescuing variants, because 
it included engineered combinations of all single mutations that had a 
detectable rescuing effect. The universe of possible variants containing 
two or more replacements is very large, so alternative permissive sets 
may exist; however, these genotypes would require multiple independ- 
ent substitutions, and the joint probability of such events would be very 
low because they cannot be acquired deterministically by selection for 
the derived function. A permissive mutation might conceivably be sub- 
ject to selection for some other function; however, unless the selected 
and derived functions are correlated, the probability that selection would 
deterministically fix a compound permissive genotype is extremely low. 
Evolution of the F mutations was therefore strongly contingent on prior 
low-probability events. 
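
The upper-bound estimate in this paragraph is simple to reproduce. The sketch below uses the counts from the text; the final step, which asks how likely a screen of this size would be to miss permissive genotypes at a hypothetical true frequency of 0.1%, is an illustrative check added here and is not an analysis reported in the paper.

```python
# Variants examined: ~3,650 from the screen plus 10 engineered double mutants.
screened_variants = 3650
engineered_doubles = 10
total_variants = screened_variants + engineered_doubles

# One permissive set (historical P) is known to exist but was not found by
# sampling, so the text bounds the frequency at less than 1 in total_variants.
upper_bound = 1 / total_variants

# Illustrative check (an assumption, not from the paper): if permissive
# genotypes occurred at a hypothetical frequency p, how often would a sample
# of this size contain none of them?
hypothetical_p = 0.001
prob_zero_found = (1 - hypothetical_p) ** total_variants

print(f"upper bound ~ {upper_bound:.3%}; P(none found | p=0.1%) ~ {prob_zero_found:.3f}")
```

Under these assumptions, a true frequency as high as 0.1% would rarely yield an empty screen, which is consistent with the text's conclusion that permissive genotypes are rarer than rescuing ones.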

To understand the mechanisms that make permissive mutations both 
necessary and rare, we characterized the biophysical effects of F, P, and 
the four sets of rescuing but non-permissive mutations. Permissive mu- 
tations are often thought to act through effects on the global stability of 
folding: function-switching mutations destabilize a protein, making it 
prone to degradation and aggregation, but permissive mutations increase 
stability, and offset this effect'*'*7*”°. Structural considerations suggested 
that a stability tradeoff might explain the effects of F and P. The F mu- 
tations cause a 3 Å shift in the position of helix H7 relative to H10 and 
the ligand, disrupting numerous contacts; they also open empty space 
between the ligand and helix H3 and remove a hydrogen bond from 
the key loop that connects AF-H to H10 (refs 21, 26). In contrast, the P 
mutations add favourable interactions—both a new hydrogen bond 
and improved packing interactions—in the crystal structure and in mo- 
lecular dynamics (MD) simulations (Extended Data Fig. 5). To elucid- 
ate the effects of F and P on stability, we measured the midpoint of 
irreversible thermal denaturation (Tm) of steroid-bound AncGR1 con- 
taining each of the historical F and P mutations. As expected, each F 
mutation except l111Q was destabilizing (Extended Data Fig. 6a), and 
the P mutations were stabilizing (Fig. 3a). 

Although these data are consistent with the global stability model, 
several other observations are inconsistent with it. First, the F and P mu- 
tations did not affect expression in mammalian cells as measured by 
western blot analysis (Extended Data Fig. 6b), indicating that AncGR1+F 
is functionally compromised rather than subject to degradation or ag- 
gregation because of reduced stability. Second, under the global stability 
model, rescuing mutations should be more frequent than we observed. The 
global model predicts that any stabilizing mutation should be permissive*”, 
and it is estimated that 1-10% of mutations are stabilizing”; however, 
only ~0.1% of our library was rescuing, and permissive mutations were 
even rarer. Third, the global stability model predicts that any rescuing 
mutation should also be permissive, but we found that several rescuing 
mutations were deleterious when introduced without the function- 
switching mutations. Finally, the rescuing mutants all increased the Tm 
of AncGR1+F more than they did in AncGR1, suggesting a specific 
epistatic effect rather than a generic compensatory mechanism (Fig. 3b 
and Extended Data Table 1). These observations all indicate that per- 
missive mutations must do more than simply increase global stability. 

To understand the requirements that permissive mutations must 
fulfil, we first examined the location of permissive and rescuing muta- 
tions in the protein’s structure. Under the global stability model, a sta- 
bilizing mutation should be permissive, irrespective of its location**”®. 
In contrast, the permissive and rescuing mutations exhibited a striking 
structural distribution, occurring in two distinct clusters near the F 
mutations: ‘pocket’ substitutions bordering the ligand cavity, and ‘AF- 
H’ substitutions at the interface between AF-H and the rest of the 
protein (Fig. 3c). Both the ancestral crystal structures and MD simula- 
tions showed that the historical P mutations yielded new favourable 
contacts involving the same structural elements destabilized by F (Ex- 
tended Data Fig. 5). Specifically, Thr 26 strengthens a hydrogen bond 
connecting helix H3 to the H10/AF-H loop, compensating for the loss 
of a hydrogen bond in this loop as a result of F mutation s212Δ. Leu 105 
improves packing interactions between helices H3 and H7, apparently 
compensating for the effects of the other F mutations on the interac- 
tions between H3, H7 and the ligand. Similarly, all rescuing mutations 
we discovered in our screen improved packing interactions involving 
AF-H or H7 (Fig. 4 and Extended Data Figs 7 and 8). 

These observations suggest that permissive mutations must stabilize 
specific local structural elements destabilized by F, rather than generic- 
ally modulating global stability. To test this hypothesis, we used the struc- 
ture to identify a potentially stabilizing pair of mutations (E165A and 
K168E) ~25 Å distant from the ligand pocket and AF-H (Fig. 3c). We 
introduced them into AncGR1+F and found that they raised Tm by 1.4 °C; 
rather than rescuing function, however, they impaired AncGR1+F’s 

Figure 3 | Permissive mutations must stabilize local structural elements. 

a, b, Effect of rescuing mutations on Tm values of AncGR1+F (a) and AncGR1 
(b). Colours correspond to structural position in c. c, Structural distribution of 
mutations on AncGR1 (PDB 3RY9). Spheres, Cα atoms. Red, historical F 
substitutions; green, historical P; blue, rescuing ligand-pocket mutations; pink, 
rescuing AF-H mutations; yellow, distant mutations that stabilize but do not 
rescue. Purple sticks show cortisol; helices are indicated. d, Change in cortisol 
sensitivity caused by E165A/K168E in AncGR1+F (yellow bar). Effects of P 
and M222L are shown for comparison. ΔTm values relative to AncGR1+F are 
shown. Results in a, b and d are shown as means and s.e.m. for the experimental 
replicates indicated by grey circles. 


cortisol sensitivity roughly tenfold (Fig. 3d). These data confirm that 
increasing global stability is not sufficient to yield a permissive effect 
and point to a biophysical requirement that limits the number of per- 
missive mutations: they must exert specific local rather than generic 
global effects on protein stability. 

This requirement explains why rescuing mutations were few, but it 
does not explain why they were functionally incompatible with AncGR1, 
suggesting that further biophysical requirements limit the number of 
permissive mutations. To elucidate these requirements, we first exam- 
ined the mechanisms by which the large-effect rescuing mutations make 
the ancestral protein super-active. All three increased the stability of 
both AncGR1 and AncGR1+F (Fig. 3a, b) and are clustered on AF-H, 
suggesting that they exert their effect by disrupting ligand-induced allo- 
steric regulation of this helix’s position (Fig. 3c), which differentiates 
inactive and active conformations. For a properly regulated receptor 
without ligand, the inactive conformation is more stable than the active 
conformation and thus the dominant species (Fig. 4a); binding of hor- 
mone stabilizes the active conformation, causing it to become dom- 
inant. To test whether the AF-H mutations unconditionally stabilized 
the active conformation, we performed MD simulations of these muta- 
tions in AncGR1 in the absence of ligand. As predicted, M222I and 
M222L improved hydrophobic packing between the active position of 
AF-H and helix H3 (Fig. 4b, c), and L231M introduced a new sulphur–π 
interaction, anchoring AF-H in the active position against H10 (Extended 

Figure 4 | Biophysical requirements make some rescuing mutations 
intolerable in the ancestral protein. a, A simple thermodynamic model 
explains why AF-H mutants lead to activity in the absence of hormone. The 
protein can exist in inactive (grey) or active (green) microstates, which are 
differentiated by AF-H’s position (blue). For each genotype, the relative free 
energy (ΔG) of active and inactive states is shown with or without hormone. 
Populated states are opaque; unpopulated states are faded. b, Snapshot from 
MD trajectory of AncGR1+M222I shows tight packing interaction between 
Ile 222 (pink) and the rest of the protein. Blue, AF-H; grey, surface that AF-H 
contacts. c, Distribution of atom contacts (centre-to-centre distances 3.5 Å or 
less) between AF-H and the rest of the protein over three replicate MD 
trajectories for AncGR1+F (black), +P (green) and +M222I (pink). The y axis 
is frequency. d, Change in position of H7 with respect to H10 from ancestral to 
derived GRs changes the effects of mutations Q114L/M197I from incompatible 
to rescuing (blue spheres). Structures are AncGR2 (left, PDB 3GN8) and 
AncGR1 (right, PDB 3RY9) with side chains at these sites introduced (spheres). 


Data Fig. 7). Stabilizing the active conformation relative to the inactive 
conformation is expected to increase the proportion of the protein in the 
active conformation, explaining why these mutations imparted activity 
in the absence of ligand and made the receptor highly sensitive to for- 
merly weak ligands (Fig. 4a). These observations point to a second lim- 
iting requirement: permissive mutations must not alter the energetic 
balance between functional conformations of the protein. That is, they 
must stabilize the ‘right’ portions of the protein without stabilizing the 
‘wrong’ portion. The global stability model does not account for these 
constraints because GR function depends not only on the stability of 
folded versus unfolded or misfolded forms but also on the stabilities of 
active versus inactive conformations in both the presence and the ab- 
sence of ligand. 
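
The two-state reasoning in this paragraph can be illustrated with a Boltzmann population calculation. This is a minimal sketch, not the authors' model: the free-energy differences are hypothetical values chosen only to show the qualitative behaviour, and the function name is an assumption.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15  # kelvin; an assumed temperature for illustration


def fraction_active(delta_g):
    """Fraction of protein in the active conformation for a two-state system.

    delta_g is G(active) - G(inactive) in kcal/mol; negative values favour
    the active conformation.
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * TEMPERATURE)))


# Hypothetical free-energy differences (kcal/mol), for illustration only.
scenarios = {
    "regulated receptor, no hormone": +2.0,
    "regulated receptor, hormone bound": -2.0,
    "AF-H mutant, no hormone": -1.0,
}

for label, delta_g in scenarios.items():
    print(f"{label}: {fraction_active(delta_g):.2f} active")
```

With these assumed values, a mutation that stabilizes the active position of AF-H by a few kcal/mol is enough to make the active state dominant without ligand, which is the behaviour the text describes for M222I, M222L and L231M.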

Finally, we examined why the rescuing pair Q114L/M197I rendered 
the ancestral protein non-functional (Fig. 2c). These sites are near the 
ligand-binding pocket, facing each other on helices H7 and H10 (Fig. 4d). 
In the presence of F, the two residues are slightly offset, and the rescuing 
states Leu 114 and Ile 197 improve hydrophobic packing between H7 
and H10, explaining their observed positive effect on the derived pro- 
tein’s stability and sensitivity (Extended Data Fig. 8). In the AncGR1 
structure, however, the shifted position of H7 places these two residues 
directly across from each other: the large side chains of the rescuing 
residues clash and destabilize the H7/H10 interaction (Fig. 4d). As pre- 
dicted by this model, the pair of rescuing states increases the Tm of 
AncGR1+F but lowers that of AncGR1 (Fig. 3b). These observations 
reveal a final requirement: permissive mutations must be compatible 
with the conformations of both the ancestral and derived proteins. 

Evolutionary contingency has usually been discussed in terms of chance 
external forces, such as random extinction by asteroid impacts or climate 
change’. Our results show that the internal organization of biological 
systems—in this case, a protein’s structure and thermodynamics—can 
give rise to strong contingency during evolution. The F mutations that 
triggered GR’s functional transition required permissive mutations to 
stabilize the specific local structural elements that F destabilized, with- 
out disturbing the energetic balance between the receptor’s functional 
conformations or clashing with ancestral or derived protein structures. 
Our data indicate that very few mutations can satisfy all these biophys- 
ical requirements, making GR’s evolution dependent on rare, low- 
probability historical events. 

Our findings point to strong contingency in the evolution both of 
GR’s primary sequence and of its molecular form—the structural and 
mechanistic underpinnings that produce the protein’s function. GR’s 
cortisol specificity was achieved by a unique repositioning of H7 and 
the reorganization of numerous hormone contacts. If other F-like mu- 
tations exist that could produce a form and function similar to those of 
the modern GR, these mutations would reorganize and destabilize the 
same local elements of the ligand-receptor complex. To be tolerated, 
these effects would have to be offset by permissive mutations. The per- 
missive mutations, in turn, would be subject to the same biophysical 
constraints as the historical permissive mutations, because those con- 
straints arise from the functional form itself and the fundamental archi- 
tecture of the GR LBD. Our experiments establish that very few accessible 
genotypes satisfy these constraints. Permissive sequence changes that 
could enable alternative ways of achieving a similar form and function— 
even using entirely different mutations—would therefore also be very rare. 

If evolutionary history could be replayed from the ancestral starting 
point, the same kind of permissive substitutions would be unlikely to 
occur. The transition to GR’s present form and function would probably 
be inaccessible, and different outcomes would almost certainly ensue. 
Cortisol-specific signalling might evolve by a different mechanism in 
the GR, or by an entirely different protein, or not at all; in each case, 
GR—or the vertebrate endocrine system more generally—would be sub- 
stantially different. Because GR is the only ancestral protein for which 
alternative evolutionary trajectories to historically derived functions 
have been explored, the generality of our findings is unknown. The spe- 
cific biophysical constraints, and in turn the degree and nature of con- 
tingency, that shape the evolution of other proteins are likely to depend 
on the particular architecture of each protein and the unique historical 
mechanisms by which its functions evolved. 


METHODS SUMMARY 


The AncGR1+F mutant library was generated by GeneMorphII-EZClone, using 
conditions to maximize single and double mutations. We characterized the library’s 
composition by sequencing random clones. For yeast two-hybrid screening, we 
cloned the LBD library into pBDGAL4 and the human steroid receptor coactivator 
peptide SRC-1 into pADGAL4 (refs 22, 23). Clones showing any growth in the screen 
were retransformed into naive yeast and characterized for hormone-dependent 
growth, then subcloned into pSG5-DBD, transfected into CHO-K1 cells, and assayed 
using a dual-luciferase reporter. Additional genotypes were generated by Quikchange 
mutagenesis. Proteins were expressed as His-tagged fusions with maltose-binding 
protein (MBP), then cleaved and purified to more than 99% purity by sequential 
affinity chromatography. We followed irreversible thermal denaturation using cir- 
cular dichroism. For MD simulations, three independent 100-ns trajectories were 
performed for each genotype, using GROMACS 4.5.5 and the CHARMM27 force 
field starting from equilibrated crystallographic coordinates with or without in silico 
mutations. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

DENR-MCT-1 promotes translation re-initiation 
downstream of uORFs to control tissue growth 


Sibylle Schleich, Katrin Strassburger, Philipp Christoph Janiesch, Tatyana Koledachkina, Katharine K. Miller, 
Katharina Haneke, Yong-Sheng Cheng, Katrin Küchler, Georg Stoecklin, Kent E. Duncan & Aurelio A. Teleman 


During cap-dependent eukaryotic translation initiation, ribosomes 
scan messenger RNA from the 5’ end to the first AUG start codon with 
favourable sequence context’”. For many mRNAs this AUG belongs 
to a short upstream open reading frame (uORF)’, and translation of 
the main downstream ORF requires re-initiation, an incompletely 
understood process’* *. Re-initiation is thought to involve the same 
factors as standard initiation”. It is unknown whether any factors 
specifically affect translation re-initiation without affecting standard 
cap-dependent translation. Here we uncover the non-canonical ini- 
tiation factors density regulated protein (DENR) and multiple copies 
in T-cell lymphoma-1 (MCT-1; also called MCTS1 in humans) as the 
first selective regulators of eukaryotic re-initiation. mRNAs contain- 
ing upstream ORFs with strong Kozak sequences selectively require 
DENR-MCT-1 for their proper translation, yielding a novel class of 
mRNAs that can be co-regulated and that is enriched for regulatory 
proteins such as oncogenic kinases. Collectively, our data reveal that 
cells have a previously unappreciated translational control system with 
a key role in supporting proliferation and tissue growth. 

Cellular protein abundance depends largely on mRNA translation’. 
Little is known about how translation of specific sets of mRNAs can be 
coordinately regulated”"®. mRNAs with uORFs require re-initiation'*“, 
whereby ribosomes translate the uORF, terminate and then restart trans- 
lating the main ORF'**"*. No metazoan trans-acting factors have yet been 
described that selectively affect re-initiation, enabling coordinate regu- 
lation of uORF-containing mRNAs. 

eIF2D (also called ligatin) and the related DENR-MCT-1 complex are 
candidate re-initiation regulators. They associate with 40S ribosomal sub- 
units and have domains implicated in RNA binding and start codon recog- 
nition (Extended Data Fig. 1a). In vitro they can recycle post-termination 
complexes, recruit initiator methionyl-tRNA (Met-tRNAiMet) to mRNAs 
containing viral internal ribosome entry sites'*’’, and affect movement of 
post-termination 80S complexes to nearby AUG codons*. DENR-MCT-1 
has not previously been implicated in re-initiation, and MCT-1 is an onco- 
gene affecting cellular mRNA translation by an unclear mechanism'*”. 
Collectively, these studies suggest that DENR-MCT-1 and eIF2D might 
regulate translation of cancer-relevant mRNAs through non-canonical 
mechanisms. 

To study DENR function, we generated Drosophila knockouts for 
the homologous gene (CG9099), lacking DENR transcript or protein 
(DENR^KO, Extended Data Fig. 1b-d). DENR^KO flies die as pharate adults 
with a larval-like epidermis (Fig. 1a), due to impaired proliferation of 
histoblast cells (Fig. 1b and Extended Data Fig. 1e). This is rescued by 
expressing DENR ubiquitously (Tubulin-GAL4) or specifically in his- 
toblast cells (Escargot-GAL4) (χ² test P < 0.05, Extended Data Fig. 1f). 
Although DENR is expressed ubiquitously (Extended Data Fig. 1g), quickly 
proliferating histoblast cells appear more sensitive to DENR loss than 
non-proliferating tissues. DENR^KO flies also have crooked legs and incor- 
rectly rotated genitals (Fig. 1c and Extended Data Fig. 1h-h’). These phe- 
notypes are not observed in mutants with generally impaired translation 


(Minutes; ref. 21), but are found in flies with reduced cell-cycle regulators 
or Ecdysone Receptor signalling”, suggesting that DENR affects trans- 
lation of a subset of mRNAs involved in cell proliferation and signalling. 

Similar phenotypes were observed in flies expressing RNA interfer- 
ence (RNAi) targeting MCT-1 (fly homologue CG5941; Extended Data 
Fig. 1i), which like human MCT-1 binds DENR (Extended Data Fig. 1j). 
Reducing ligatin (fly homologue of human eIF2D) gene dosage in DENR^KO 
flies caused fewer animals to reach pupation (χ² test P < 0.05, Extended 
Data Fig. 1k), indicating that DENR^KO phenotypes result from loss of 
DENR-MCT-1 complex with eIF2D-like activity. 

DENR^KO larvae and DENR knockdown S2 cells grow slowly with 
reduced protein accumulation rates (Fig. 1d, e, g). Mutant polysome pro- 
files show reduced polysome/monosome ratios (Fig. 1f, h and Extended 
Data Fig. 1l-n’), suggesting defective translation initiation. Despite more 
ribosomes and initiator tRNA per cell (Extended Data Fig. 1o, p), DENR 
knockdown cells have reduced protein synthesis rates when proliferat- 
ing (Fig. 1i). When quiescent, DENR knockdown cells no longer display 
these phenotypes, and become enlarged compared to controls (Fig. 1h, i 
(right panel) and Extended Data Fig. 1q-r). Thus, DENR promotes trans- 
lation of cellular mRNAs in proliferating but not quiescent cells. 

We identified ~100 mRNAs requiring DENR for efficient translation 
by profiling actively translated mRNAs from 80S and polysome fractions 
of control and DENR knockdown cells and normalizing to total mRNA 
(Supplementary Table 1). We further analysed myoblast city (mbc) because 
it was the second most under-translated mRNA and we could obtain anti- 
body to detect it. Quantitative polymerase chain reaction with reverse 
transcription (RT-PCR) confirmed that mbc mRNA is under-represented 
in polysomes of DENR knockdown cells (Fig. 2a), leading to reduced 
Mbc protein but not mRNA (Fig. 2b), whereas other proteins were not 
reduced (Extended Data Fig. 2a). The mbc 5’ UTR was sufficient to impart 
DENR-dependence to a Renilla luciferase (RLuc) reporter (Extended Data 
Fig. 2b). This DENR dependence requires the 5’ cap and is not accom- 
panied by a drop in general translation (Extended Data Fig. 2c, d’ and 
Fig. 2c). Combined knockdown of DENR and MCT-1 had no additive 
effect, as they are a functional complex (Extended Data Fig. 2e). In sum, 
the DENR-MCT-1 complex selectively promotes cap-dependent trans- 
lation of mbc via its 5’ UTR. 
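
The normalization described in this paragraph, polysome-associated signal divided by total mRNA and compared between knockdown and control cells, can be sketched as follows. All numbers and function names here are hypothetical; the paper's actual pipeline is not described in this excerpt.

```python
import math


def translation_efficiency(polysome_signal, total_mrna):
    """Translated-fraction signal normalized to total mRNA abundance."""
    return polysome_signal / total_mrna


def log2_te_change(knockdown, control):
    """log2 ratio of translation efficiency, knockdown versus control.

    Each argument is a (polysome_signal, total_mrna) tuple.
    """
    return math.log2(
        translation_efficiency(*knockdown) / translation_efficiency(*control)
    )


# Hypothetical measurements: (polysome signal, total mRNA) per condition.
measurements = {
    "mbc": {"control": (100.0, 50.0), "knockdown": (40.0, 50.0)},
    "unaffected_gene": {"control": (80.0, 40.0), "knockdown": (82.0, 41.0)},
}

for gene, cond in measurements.items():
    change = log2_te_change(cond["knockdown"], cond["control"])
    print(f"{gene}: log2 TE change = {change:+.2f}")
```

A strongly negative value with unchanged total mRNA corresponds to the pattern the text reports for mbc: reduced polysome association and protein, but not reduced mRNA.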

Systematic 5’ UTR truncations (Extended Data Fig. 2f-h) identified 
175 nucleotides necessary and sufficient for DENR dependence (Fig. 2d 
and Extended Data Fig. 3a, b) containing 3 uORFs with strong Kozak 
sequences (stuORFs, red boxes in Fig. 2d). Mutating all three stuORF 
ATGs, or their Kozak sequences, abolished DENR dependence (Fig. 2e, f 
and Extended Data Fig. 3c), indicating that translation initiation on these 
stuORFs is necessary for DENR dependence. No additional cis-acting 
sequences were necessary; removing sequences upstream, downstream, 
or between the uORFs, or mutating the uORF coding sequences, did not 
affect DENR dependence (Fig. 2g and Extended Data Fig. 2f-h). Two 
possible explanations are: (1) DENR promotes bypass of stuORF initi- 
ation codons; and (2) DENR affects re-initiation after stuORF translation. 
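
The features tested in these reporter experiments (uORF start codon, Kozak context and in-frame stop codon) can be located in a 5′ UTR sequence with a short scan. This is a sketch under stated assumptions: the example sequence is invented, and the rule for a 'strong' Kozak context (a purine at position −3) is a common simplification, not the criterion used in the paper, which this excerpt does not specify.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr):
    """Return uORFs (ATG ... in-frame stop) found entirely within a 5' UTR.

    Each uORF is reported with its start index, stop index, number of sense
    codons (including ATG, excluding the stop) and a simplified Kozak call.
    """
    utr = utr.upper()
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        stop = None
        # Walk codon by codon from the ATG looking for an in-frame stop.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                stop = pos
                break
        if stop is None:
            continue  # no in-frame stop inside the UTR; not a contained uORF
        # Assumed rule: purine at -3 relative to the A of ATG counts as strong.
        minus3 = utr[start - 3] if start >= 3 else ""
        uorfs.append({
            "start": start,
            "stop": stop,
            "codons": (stop - start) // 3,
            "strong_kozak": minus3 in ("A", "G"),
        })
    return uorfs


# Invented example: one uORF in a favourable context and one preceded by
# gtgt, the less functional context mentioned in the Figure 2 legend.
example_utr = "CCAAAATGGCCTAAGGGTGTATGCCCTGATT"
for uorf in find_uorfs(example_utr):
    print(uorf)
```

Removing a stop codon in such a sequence makes the scan skip that ATG as a contained uORF, which mirrors the stop-codon mutation experiment used to distinguish the two explanations.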

Figure 1 | DENR promotes cell proliferation and 
boosts protein synthesis in proliferating but not 
quiescent cells. a, DENR^KO flies die as pharate 
adults with larval-like abdominal epidermis. 
correct cell numbers at onset of pupation (0 h 

act at the stuORF start codon. In model 2 the stuORF stop codon is cru- 
cial as translation re-initiation on the main ORF only occurs after termi- 
nation on the stuORF. Two point mutations removing the stuORF stop 
codons completely abolished DENR dependence (Fig. 2h and Extended 
Data Fig. 3d), indicating that DENR promotes translation re-initiation. 

to impart DENR dependence, with multiple stuORFs acting additively 
(Fig. 3a). Re-initiation efficiency is reportedly inversely related to uORF 
length, presumably because initiation factors dissociate from ribosomes 
as elongation proceeds’. Consistently, the ability of DENR to promote re- 
initiation dropped as ORFs became longer (Fig. 3a, right panel), reaching 

Figure 2 | DENR promotes re-initiation of 
translation downstream of uORFs in the mbc 5’ 
UTR. a, qRT-PCR validation that the mbc mRNA 
is preferentially depleted from polysomes in DENR 
knockdown cells (DENR A and B) compared to 
controls (GFP A and B). b, Mbc protein (left) but 
not mRNA (right) levels are reduced in DENR and 
MCT-1 knockdown cells. c, Translation extracts 
from DENR knockdown cells are impaired in 
translating a reporter containing the mbc 5' UTR 
(left) but not in translating a control RLuc reporter 
mRNA without uORFs in the 5’ UTR (right). 

d, Schematic overview of the mbc 5’ UTR and the 
tested DNA reporter constructs, summarizing 
results from other panels as well as multiple (≥3)
additional replicates on all the luciferase assays, 
not shown. Details in Extended Data Fig. 3. 

e, f, Mutating the start codons of the three mbc 
uORFs with strong Kozak sequences (e) or their 
Kozak sequences to the less functional gtgtATG 
(f) blunts regulation by DENR. g, Mutating the 
coding sequence of mbc uORFs to poly-glutamine 
has no effect on DENR regulation. h, Mutation 
of the stop codons of mbc uORFs 218, 248 and 338, 
as diagrammatically shown in d, causing the 
uORFs to extend past the RLuc ATG, leads to 
loss of DENR-dependent regulation. i, DENR 
knockdown leads to impaired expression of the 
mbc 5' UTR RLuc reporter in proliferating but not 
quiescent S2 cells. Error bars: s.d. t-test *P < 0.05,
***P < 0.001.
Figure 3 | uORFs with strong Kozak sequences (stuORFs) are sufficient
to impart DENR-MCT-1-dependent regulation. a, Introduction of 
synthetic uORFs bearing a ‘strong’ Kozak into a control 5’ UTR imparts 
zero effect on a dicistronic transcript containing a long upstream ORF
(not shown). In sum, mRNAs display a continuum of DENR dependence, 
depending on the number and length of the uORFs and the strength of 
their Kozak sequences. A computational search revealed thousands of 
5’ UTRs containing uORFs (Extended Data Fig. 4a). We generated a pre- 
dicted ‘DENR-dependence score’ for all transcripts based on the number 
of uORFs they contain and the strength of their Kozak sequences (Ex- 
tended Data Fig. 4b, b’ and Supplementary Table 2). Transcripts with 
high DENR-dependence scores were significantly enriched among the 
mRNAs with reduced translation on DENR knockdown (Extended Data 
Fig. 4c, d), suggesting a general mechanism. We tested ten 5’ UTRs pre- 
dicted to be DENR-dependent using luciferase assays. Six conferred DENR 
dependence (Fig. 3b), and four inhibited reporter translation too strongly 
to test experimentally. Conversely, 16 5’ UTRs without uORFs were not 
DENR-dependent (Fig. 3c and data not shown). Therefore, 5’ UTRs with 
stuORFs are DENR-dependent, identifying a new class of transcripts for 
which translation can be co-regulated. Gene Ontology analysis revealed
that these genes are enriched for transcriptional regulators and kinases 
(Extended Data Fig. 4e, f). 
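
The scoring idea described above (number of uORFs weighted by the strength of their Kozak context) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring scheme, which is defined in Extended Data Fig. 4b. The 4-nt consensus `CAAA`, the equal weighting of each position, the requirement that a uORF terminate inside the 5' UTR, and all function names are assumptions introduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ASSUMED_CONSENSUS = "CAAA"  # hypothetical -4..-1 context; not taken from the paper


def find_uorfs(utr):
    """Return (start, end) pairs for every ATG with an in-frame stop inside the UTR."""
    utr = utr.upper()
    uorfs = []
    for i in range(len(utr) - 2):
        if utr[i:i + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first in-frame stop codon.
        for j in range(i + 3, len(utr) - 2, 3):
            if utr[j:j + 3] in STOP_CODONS:
                uorfs.append((i, j + 3))
                break
    return uorfs


def kozak_strength(utr, start):
    """Fraction of the four bases upstream of the ATG that match the assumed consensus."""
    context = utr[max(0, start - 4):start].upper()
    if len(context) < 4:
        return 0.0  # too close to the 5' end to evaluate the context
    matches = sum(a == b for a, b in zip(context, ASSUMED_CONSENSUS))
    return matches / 4


def dependence_score(utr):
    """Sum of Kozak strengths over all uORFs: more and stronger uORFs score higher."""
    return sum(kozak_strength(utr, start) for start, _ in find_uorfs(utr))
```

Under these assumptions a UTR with several well-matched uORFs scores higher than one with a single weak uORF, which mirrors the continuum of DENR dependence described in the text; uORF length is deliberately left out of this sketch.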

Immunoprecipitation of DENR showed that it binds mRNAs con- 
taining or lacking stuORFs (Extended Data Fig. 4g), suggesting that it 
interacts generally with initiating ribosomes, but is required on stuORF- 
containing mRNAs. Because only 15% of genes contain stuORFs, we were 
surprised to see global effects on polysomes upon DENR knockdown 
(Extended Data Fig. 11). A DENR knockdown time course revealed that 
stuORF-dependent translation drops before changes in polysome or ribo- 
some levels (Extended Data Fig. 5), indicating that these are probably 
secondary consequences. 

Because Insulin and Ecdysone receptors (InR and EcR) contain DENR- 
dependent 5’ UTRs (Fig. 3b), we asked whether impaired InR and EcR 
translation contributes to DENR^KO phenotypes. Loss of DENR-MCT-1
function in S2 cells or DENR^KO animals leads to reduced InR and EcR
proteins, but not mRNA levels, and reduced InR and EcR signalling (Fig. 4a-c
and Extended Data Fig. 6a-c’). Reconstituting InR/EcR expression in
DENR^KO animals partially but significantly rescued developmental rate
and histoblast proliferation (Fig. 4d, e and Extended Data Fig. 6d). Thus,
loss of DENR-MCT-1 causes reduced InR/EcR translation and signalling, 
and consequently impaired cell proliferation and organismal development. 
DENR-dependent regulation (DNA reporters). b, c, 5’ UTRs bearing stuORFs
are all DENR-dependent (b) whereas 5’ UTRs lacking uORFs (c) are not 
(DNA reporters). Error bars indicate s.d. 


Data from DENR^KO flies and DENR knockdown cells suggested that
proliferating cells are phenotypically more sensitive to DENR loss-of- 
function than quiescent cells. One explanation could be that DENR 
activity is low in quiescent cells, hence its removal has little effect. Using 
mbc and stuORF reporters, and endogenous mbc translation, as read- 
outs for DENR activity revealed that DENR loss had a larger impact in 
proliferating compared to quiescent cells (Fig. 2i and Extended Data Fig. 7).
Hence DENR-MCT-1 present in quiescent cells is not very active. 

To study DENR function in vivo, we generated flies carrying fluores- 
cent reporters with or without a stuORF (Extended Data Fig. 8a). These 
reporters have identical promoters, 5’ UTRs and 3’ UTRs, and are inte- 
grated in exactly the same genomic locus via phiC31-mediated recom- 
bination, ensuring their identical transcription. This revealed that DENR 
promotes stuORF reporter, but not control reporter, expression in ani- 
mals (Extended Data Fig. 8). Because stuORF-GFP reporter expression 
is entirely DENR-dependent, it serves as an in vivo DENR activity read- 
out. Interestingly, the larval anterior, which contains proliferating tissues 
like brain and imaginal discs, shows stronger DENR activity than other 
larval regions (Extended Data Fig. 8b). Inclusion of an RFP normaliza- 
tion control in trans, analogous to a dual-luciferase assay set-up, revealed 
high DENR activity (stuORF-GFP/normalization-RFP) in proliferating
tissues (brain and imaginal discs), and low activity in tissues with grow- 
ing, but non-proliferating, cells (salivary gland and fat body, Extended 
Data Fig. 9). 

We wondered how DENR-MCT-1 activity is regulated. Neither DENR 
protein levels nor DENR-MCT-1 binding dropped in quiescent S2 cells 
(Extended Data Fig. 10a-c). Phosphorylation of T82, T125 and a double
phosphorylation on T118 and S119 in human and fly MCT-1 have been
observed. Using cells where endogenous MCT-1 is knocked down via
its 3’ UTR and then reconstituted with MCT-1 versions lacking the endogenous
3’ UTR revealed that mutations blocking T118/S119 phosphorylation
abolished MCT-1 activity (Extended Data Fig. 10d, d’). Notably,
T118 and S119 are evolutionarily conserved in humans. Although MCT-1 
was observed to be phosphorylatable in vitro by Erk and Cdc2 (ref. 26), 
we could not observe an effect of Erk, Cdc2, PI(3)K, Akt or TORC1 inhi- 
bition on stuORF reporter expression (Extended Data Fig. 10e-g). Further 
work will be required to identify upstream kinases regulating DENR- 
MCT-1. 
Figure 4 | Loss of DENR leads to reduced InR and EcR protein levels and
signalling. a, DENR and MCT-1 knockdown cells have reduced InR and
EcR protein levels. b, c, DENR knockdown cells are less sensitive to insulin
stimulation (1 h) (b) and to ecdysone (20E, 1 µM, 4 h; c). d, e, Expression of EcR


We have identified a new translational control system regulating an 
abundant class of mRNAs, featuring: (1) stuORFs as the critical cis-element; 
(2) DENR-MCT-1 as the trans-acting factor; and (3) proliferation as 
an important cellular context. This system differs fundamentally from 
GCN4/ATF4 paradigms both mechanistically and functionally. Unlike 
GCN4-type mechanisms, DENR-MCT-1 functions in non-stressed
cells, when general translation is not compromised, and independently 
of uORF to main-ORF distance (Fig. 2h), to promote proliferation. Impor- 
tantly, DENR-MCT-1 uncouples translation re-initiation from standard 
initiation, as it is not required for initiation (Fig. 2c (right)). In contrast, 
GCN4-type mechanisms rely on coupling of initiation and re-initiation to 
antagonistically regulate GCN4/ATF4 versus all other genes (Supplemen- 
tary discussion). Our results suggest that re-initiation can be indepen- 
dently controlled via DENR-MCT-1 to modulate translation of a specific 
group of mRNAs. 


METHODS SUMMARY 


DENR knockout flies were generated by homologous recombination using pW25. 
Phospho-dAkt(T342) antibody was developed in collaboration with PhosphoSo- 
lutions. Luciferase assays were performed using a dual-luciferase set-up based on 
pGL3 vectors containing either firefly or Renilla luciferase, and the Drosophila hsp70 
basal promoter. The number of replicates for each experiment are described in Sup- 
plementary Information. 
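
The dual-luciferase read-out used throughout can be sketched as follows. This is a minimal illustration, assuming that the Renilla signal is the test reporter and the firefly signal is the normalization control (the figure legends refer to an RLuc reporter carrying the mbc 5’ UTR); the function names and the fold-change definition are assumptions introduced here, not the authors' analysis code.

```python
def normalized_activity(reporter_signal, normalizer_signal):
    """Reporter luminescence divided by the co-transfected normalization control."""
    if normalizer_signal <= 0:
        raise ValueError("normalizer signal must be positive")
    return reporter_signal / normalizer_signal


def knockdown_fold_change(kd_reporter, kd_normalizer, ctrl_reporter, ctrl_normalizer):
    """Normalized activity in knockdown cells relative to control cells.

    Values below 1 indicate that the reporter depends on the depleted factor.
    """
    kd = normalized_activity(kd_reporter, kd_normalizer)
    ctrl = normalized_activity(ctrl_reporter, ctrl_normalizer)
    return kd / ctrl
```

Dividing by the control reporter first removes differences in transfection efficiency and cell number before the knockdown and control conditions are compared.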


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
Histone H4 tail mediates allosteric regulation of
nucleosome remodelling by linker DNA 


William L. Hwang*, Sebastian Deindl*, Bryan T. Harada & Xiaowei Zhuang


Imitation switch (ISWI)-family remodelling enzymes regulate access 
to genomic DNA by mobilizing nucleosomes. These ATP-dependent
chromatin remodellers promote heterochromatin formation and transcriptional
silencing by generating regularly spaced nucleosome
arrays. The nucleosome-spacing activity arises from the dependence
of nucleosome translocation on the length of extranucleosomal linker
DNA, but the underlying mechanism remains unclear. Here we
study nucleosome remodelling by human ATP-dependent chromatin 
assembly and remodelling factor (ACF), an ISWI enzyme comprising a 
catalytic subunit, Snf2h, and an accessory subunit, Acf1 (refs 2, 11-13). 
We find that ACF senses linker DNA length through an interplay 
between its accessory and catalytic subunits mediated by the histone 
H4 tail of the nucleosome. Mutation of AutoN, an auto-inhibitory 
domain within Snf2h that bears sequence homology to the H4 tail,
abolishes the linker-length sensitivity in remodelling. Addition of 
exogenous H4-tail peptide or deletion of the nucleosomal H4 tail 
also diminishes the linker-length sensitivity. Moreover, Acf1 binds
both the H4-tail peptide and DNA in an amino (N)-terminal domain 
dependent manner, and in the ACF-bound nucleosome, lengthen- 
ing the linker DNA reduces the Acf1-H4 tail proximity. Deletion of 
the N-terminal portion of Acf1 (or its homologue in yeast) abolishes 
linker-length sensitivity in remodelling and leads to severe growth 
defects in vivo. Taken together, our results suggest a mechanism for 
nucleosome spacing where linker DNA sensing by Acf1 is allosteri- 
cally transmitted to Snf2h through the H4 tail of the nucleosome. 
For nucleosomes with short linker DNA, Acf1 preferentially binds to
the H4 tail, allowing AutoN to inhibit the ATPase activity of Snf2h. 
As the linker DNA lengthens, Acf1 shifts its binding preference to 
the linker DNA, freeing the H4 tail to compete AutoN off the ATPase 
and thereby activating ACF. 

The packaging of DNA into nucleosomes presents a substantial energy
barrier that restricts access to the genomic DNA. ISWI-family remodellers
use the energy from ATP hydrolysis to disrupt histone-DNA
contacts and reposition nucleosomes. The catalytic subunits of ISWI
enzymes possess an SF2-like ATPase that translocates DNA across the
nucleosome. The nucleosome translocation activity is further regulated
by the accessory subunits of ISWI complexes. Many ISWI remodellers
exhibit a nucleosome-spacing activity. Critical to this spacing
activity are two features of the nucleosome that modulate the activity of
ISWI remodellers: (1) the N-terminal tail of histone H4 (refs 8, 17-20)
and (2) the length of the extranucleosomal linker DNA. The unmodified
H4 tail stimulates ISWI activity by relieving the autoinhibitory
effect of the AutoN domain within the catalytic subunit. H4 tail acetylation
associated with transcriptionally active chromatin is thought
to help prevent ISWI-induced nucleosome spacing at actively transcribed
genes. Regulation by the extranucleosomal linker DNA is responsible
for generating the regularly spaced nucleosome arrays important
for heterochromatin formation. Shortening the linker DNA reduces the
remodelling activity of nucleosome-spacing ISWI enzymes. As a result,


nucleosomes are preferentially moved towards longer linkers to promote
uniform spacing on nucleosome arrays. Interestingly, the catalytic
activity of many ISWI-family enzymes is sensitive to linker DNA
lengths up to approximately 60-70 base pairs (bp), consistent with
the inter-nucleosome spacing of heterochromatin observed in human
cells. This linker-length sensing range substantially exceeds the binding
footprint (20-30 bp) of the catalytic subunit, whereas the accessory
subunits of ISWI complexes can bind linker DNA as far as ~60 bp from
the nucleosome edge. However, it is unknown how accessory subunits
communicate linker length information to the catalytic subunit to regulate
remodelling activity. In this work, we investigate the mechanism
underlying DNA linker-length sensing by a prototypical ISWI-family
enzyme, human ACF.

To examine how linker DNA regulates nucleosome translocation by
ACF, we reconstituted mononucleosomes with varying linker lengths
(n = 20-78 bp) on the entry side but a constant exit-side linker length of
3 bp (Fig. 1a). We also constructed mononucleosomes with wild-type
(WT) histone H4 and two H4 mutants: (1) H4 tail deletion (H4Δ1-19)
and (2) H4 with K16A mutation (H4K16A). We refer to nucleosome constructs
with the following nomenclature: [WT H4/H4Δ1-19/H4K16A,
n bp] for nucleosomes with n bp of DNA on the entry side and an octamer
containing WT H4, H4Δ1-19 or H4K16A. We detected ACF-catalysed
nucleosome translocation using fluorescence resonance energy transfer
(FRET) by labelling the end of the exit-side linker DNA with the FRET
acceptor Cy5, and the histone H2A with the FRET donor Cy3 (Fig. 1a).
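
As a rough illustration of how such a read-out becomes a FRET value, the sketch below computes an apparent (ratiometric) FRET efficiency from donor and acceptor intensities. It is a simplification under stated assumptions: background subtraction, spectral crosstalk and detection-efficiency corrections are ignored, and the function name is hypothetical. The correction scheme the authors actually used is described in their Methods, not here.

```python
def apparent_fret(donor_intensity, acceptor_intensity):
    """Apparent FRET efficiency E = I_A / (I_A + I_D), with no corrections applied."""
    total = donor_intensity + acceptor_intensity
    if total <= 0:
        raise ValueError("total fluorescence must be positive")
    return acceptor_intensity / total


# A hypothetical trace: as DNA moves towards the exit side, Cy5 moves away
# from Cy3, so the acceptor signal falls and the donor signal rises.
donor = [20.0, 40.0, 60.0]
acceptor = [80.0, 60.0, 40.0]
trace = [apparent_fret(d, a) for d, a in zip(donor, acceptor)]
```

In this toy trace the apparent efficiency falls at each step, which is the direction of change the text describes upon addition of ACF and ATP.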

We first compared the remodelling kinetics of [WT H4, 78 bp], [WT
H4, 40 bp], [WT H4, 20 bp] and [H4Δ1-19, 78 bp] nucleosomes using
an ensemble FRET assay. Upon addition of ACF and ATP, the FRET
efficiency decreased as DNA was translocated towards the exit side
(Fig. 1b and Extended Data Fig. 1a). As expected, the remodelling rate
decreased as the linker DNA was shortened and deletion of the H4 tail
drastically reduced the remodelling activity (Fig. 1b).

To identify which step(s) of the remodelling process are regulated,
we monitored the remodelling of individual nucleosomes using single-molecule
FRET. Single-nucleosome remodelling traces featured incremental
translocation of DNA to the exit side interrupted by kinetic pauses
(Fig. 1c). The first pause occurred after ~7 bp of DNA translocation
and the second pause occurred after an additional ~3 bp of translocation
(Extended Data Fig. 2a, b), consistent with previous findings.
Moreover, the step sizes did not change with linker DNA length or histone
H4 modification (Extended Data Fig. 2a, b). We divided the remodelling
time trace into two translocation phases (T1, T2), during which
the FRET efficiency decreased, and two pause phases (P1, P2), during
which the FRET value remained constant (Fig. 1c). Notably, the DNA
translocation rates between pauses did not change, whereas the pause-phase
exit rates decreased dramatically when the linker DNA was shortened
(Fig. 1d and Extended Data Fig. 2c). Moreover, the dependence of
remodelling kinetics on entry-side linker lengths of mononucleosomes
was quantitatively similar to the dependence on inter-nucleosome linker


1Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts 02138, USA. 2Graduate Program in Biophysics, Harvard University, Cambridge, Massachusetts 02138, USA. 3Harvard/MIT MD-PhD Program, Harvard Medical School, Boston, Massachusetts 02115, USA. 4Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA. 5Department of Physics, Harvard University, Cambridge, Massachusetts 02138, USA.
*These authors contributed equally to this work. 


14 AUGUST 2014 | VOL 512 | NATURE | 213 


©2014 Macmillan Publishers Limited. All rights reserved 




[Figure 1 graphic. a, Nucleosome schematic: Cy5 on the exit-side linker, Cy3 on the octamer, biotin on the entry side; addition of ACF and ATP shifts the construct from high FRET to low FRET. b, Ensemble time courses for WT ACF, time axis in min. c, Donor (Cy3) and acceptor (Cy5) traces with FRET. d, Rates versus linker DNA length (bp) for the T1, T2, P1 and P2 phases. e, Rates for the histone H4 variants WT H4, H4K16A and H4Δ1-19.]


Figure 1 | The linker DNA length and histone H4 tail regulate the 
remodelling pause phases but not the translocation phases. a, Schematic 
of a FRET-labelled mononucleosome undergoing remodelling by ACF. 

b, Ensemble remodelling time courses of [WT H4, 78 bp], [WT H4, 40 bp],
[WT H4, 20 bp] and [H4Δ1-19, 78 bp] nucleosomes by 40 nM ACF at 5 µM
ATP. Nucleosome translocation is monitored by the emission intensity of the 
FRET acceptor Cy5 under excitation of the FRET donor Cy3. c, Cy3 and 
Cy5 fluorescence (top; a.u., arbitrary units) and FRET (bottom) time traces 
during the remodelling of a single [WT H4, 78 bp] nucleosome with the 


lengths of dinucleosomes (Extended Data Fig. 3), validating the use of
mononucleosomes as a model system to study linker-length sensitivity.
Interestingly, the H4 tail appeared to regulate the same phase of the
remodelling process as the linker DNA (Fig. 1e). The H4K16A mutation
and H4 tail deletion (H4Δ1-19) decreased the pause-phase exit rate
by approximately 2- and 20-fold, respectively (Fig. 1e). In contrast, neither
modification had any appreciable effect on the translocation rates
between pauses (Fig. 1e).

The above results indicate that both linker DNA and the H4 tail regulate
the remodelling rate by changing the duration of pause phases, suggesting
that these nucleosome features may impinge on an inhibitory
mechanism that prevents the initiation of the DNA translocation phases.
It has been shown that although the ISWI ATPase domain can translocate
nucleosomes autonomously, the catalytic subunit contains two
well-conserved autoregulatory domains, AutoN and NegC, which inhibit
ATP hydrolysis and its coupling to DNA translocation, respectively.
The AutoN inhibition can be relieved by the H4 tail whereas the NegC
inhibition can be relieved by binding of the HAND-SANT-SLIDE module
to linker DNA. Could the regulation of remodelling by linker DNA
length occur through these inhibitory domains?

To address this question, we first examined the role of the NegC domain.
Surprisingly, deletion of the NegC domain in the ACF complex (ΔNegC
ACF) did not substantially affect the dependence of remodelling kinetics
on linker DNA lengths ranging from 20 to 78 bp (Fig. 2a-c and Extended
Data Fig. 4a). Removing the H4 tail dramatically reduced the remodelling
rates of both WT and ΔNegC ACF (Fig. 2b). These results suggest
that the NegC domain does not play a substantial role in linker length
sensing by the ACF complex. In contrast, the isolated Snf2h catalytic
subunit exhibited a short-range (20-40 bp) linker length sensitivity that
depended on NegC (Extended Data Fig. 5), in a manner similar to the
Drosophila ISWI, which lacks any accessory subunit.

Next, we mutated the AutoN domain with two point substitutions 
(R142A and R144A) in the ACF complex (AutoN-2RA ACF; Fig. 3a and 
Extended Data Fig. 4a). AutoN bears sequence homology to the H4 tail, 
which can compete the inhibitory AutoN domain off the ATPase, and the 
2RA mutation in AutoN is expected to diminish the H4 tail dependence 
of remodelling by ISWI enzymes. Remarkably, this mutation not only


translocation (T1, T2) and pause (P1, P2) phases indicated. d, Linker DNA 
length dependence of the translocation rates between pauses (left, defined as the 
average number of base pairs moved per second) and pause-phase exit rates 
(right, defined as the inverse of the average pause durations). e, Dependence of 
the translocation rates between pauses (left) and pause-phase exit rates (right) 
on the H4 variants. In d and e, [ACF] = 10 nM and [ATP] = 2 mM. Data
are mean ± s.e.m. derived from at least 100 (d) or at least 50 (e) individual
nucleosome remodelling traces from three independent experiments.
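
The two rate definitions in this legend can be written out as a short calculation. This is a sketch under assumptions: the legend defines the pause-phase exit rate as the inverse of the average pause duration, which is implemented directly, whereas for the translocation rate the legend does not say whether base pairs per second are averaged per trace or pooled, so pooling is assumed here. Function names are hypothetical.

```python
from statistics import mean


def pause_exit_rate(pause_durations_s):
    """Inverse of the average pause duration, in s^-1 (definition from the legend)."""
    return 1.0 / mean(pause_durations_s)


def translocation_rate(bp_moved, phase_durations_s):
    """Base pairs moved per second, pooled over all traces (pooling is an assumption)."""
    return sum(bp_moved) / sum(phase_durations_s)


# Hypothetical measurements from three single-nucleosome traces.
p1_durations = [2.0, 4.0, 6.0]   # seconds spent in the first pause
t1_bp = [7, 7, 7]                # bp moved during the first translocation phase
t1_durations = [1.0, 1.0, 1.5]   # seconds spent in that phase

p1_rate = pause_exit_rate(p1_durations)
t1_rate = translocation_rate(t1_bp, t1_durations)
```

Because the exit rate is the reciprocal of a mean duration, a few very long pauses lower it sharply, which is why short-linker nucleosomes show such a pronounced drop in this quantity.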


increased the remodelling rate of nucleosomes lacking the H4 tail, but 
also completely abolished the linker-length dependence of remodelling 
by specifically increasing the remodelling rate of short-linker nucleo- 
somes (Fig. 3b, c and Extended Data Fig. 6). These results suggest an 
essential role for AutoN in linker length sensing by the ACF complex. 

Since AutoN competes with the H4 tail for binding to the ATPase,
we considered the possibility that this competition is involved in sens- 
ing linker DNA length and hypothesized that the H4 tail is only available 
to compete AutoN off the ATPase when the linker DNA is sufficiently 
Figure 2 | Deletion of the NegC domain of the Snf2h catalytic subunit does
not substantially affect linker DNA length sensing by the ACF complex. 

a, Domain architecture of WT and ΔNegC Snf2h (residues 669-700 replaced
with a SGSGS linker). b, Ensemble remodelling time courses of [WT H4, 78 bp],
[WT H4, 40 bp], [WT H4, 20 bp] and [H4Δ1-19, 78 bp] nucleosomes by
40 nM WT ACF (black/grey lines, duplicated from Fig. 1b) and ΔNegC ACF
(red/pink symbols) at 5 µM ATP. c, Linker DNA length dependence of the
pause-phase exit rate (P1 phase) measured for WT ACF (black) and ΔNegC
ACF (red). [ACF] = 10 nM and [ATP] = 20 µM. Data are mean ± s.e.m.
derived from at least 100 individual nucleosome remodelling traces from three
independent experiments.
important for linker DNA length sensing by the ACF complex. a, Domain
remodelling time courses of [WT H4, 78 bp], [WT H4, 40 bp] and [H4Δ1-19,
78 bp] nucleosomes by 40 nM WT ACF (black/grey lines, duplicated
from Fig. 1b) and AutoN-2RA ACF (blue/cyan symbols) at 5 µM ATP.
c, Dependence of the pause-phase exit rate on the linker DNA length and H4
tail for WT (black) or AutoN-2RA ACF (blue/cyan). *Too slow to be measured.
d, Effect of the exogenously added H4-tail peptide on the pause-phase exit
rates during remodelling by WT ACF. e, Pause-phase exit rates of nucleosomes
lacking the H4 tail during remodelling by WT ACF. In c-f, [ACF] = 10 nM
and [ATP] = 20 µM, except that 2 mM of ATP was used in e to make the
pause exit rates measurable for nucleosomes lacking the H4 tail. Data are
mean ± s.e.m. from at least 100 (c, d) or at least 50 (e) individual nucleosome
remodelling traces from three independent experiments.


long. Consistent with this hypothesis, adding exogenous H4-tail pep- 
tide, which should help compete AutoN off the ATPase when the nucle- 
osomal H4 tail is unavailable, specifically increased the remodelling rate 
of short-linker nucleosome ([WT H4, 40 bp]) by WT ACF (Fig. 3d). Fur- 
thermore, deletion of the nucleosomal H4 tail, in addition to slowing 
down remodelling, abolished the dependence of remodelling rate on 
linker DNA length (Fig. 3e). These results indicate that the H4 tail is 
indeed involved in linker DNA sensing. 

Because the catalytic subunits of ISWI-family enzymes only interact
with ~20-30 bp of extranucleosomal DNA, the linker-length sensitivity
of ACF cannot be accounted for by the catalytic subunit alone. Our
findings raise the intriguing possibility of a linker-length sensing mechanism
where the accessory subunit Acf1 interacts with the H4 tail in a
linker-length-dependent manner, which modulates the H4 tail availability
for competing with AutoN. To test this possibility, we generated
two Acf1 mutants, ΔC-term Acf1 and ΔN-term Acf1, in which 134 residues
at the carboxy (C) terminus or 371 residues at the N terminus were
deleted, respectively (Extended Data Fig. 7a and Fig. 4a). Because the
central region of Acf1 required for Snf2h binding was not deleted,
both mutants were able to form complexes with Snf2h, which are referred
to as ΔC-term and ΔN-term ACF (Extended Data Fig. 4b).

We first probed which region of Acf1 interacts with the H4 tail by
comparing the binding affinities of WT, ΔC-term and ΔN-term Acf1 for
the H4-tail peptide using a fluorescence anisotropy assay. Interestingly,
WT Acf1 exhibited specific, nanomolar affinity for the H4-tail peptide
(Fig. 4b and Extended Data Fig. 7b) that was not substantially altered
upon deletion of the C-terminal region (Extended Data Fig. 7c), but
was completely lost upon deletion of the N-terminal portion (Fig. 4b).
These results indicate that Acf1 interacts with the H4 tail probably through
its N-terminal region. Acf1 also bound double-stranded DNA and deletion
of the N-terminal region abolished this interaction too (Extended
Data Fig. 8), consistent with the previous finding that the WAC motif
within the N-terminal region is important for binding of ACF to the
linker DNA. Given the distinct properties of DNA and the H4 tail,
their specific binding interfaces within Acf1 N-term are probably distinct.
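
For readers unfamiliar with how a dissociation constant is extracted from an anisotropy titration, the sketch below fits a simple one-site binding isotherm. It is not the authors' fitting procedure, which is not described in this section. It assumes that the labelled peptide concentration is well below the Kd (so free protein approximates total protein), that the free and bound anisotropy plateaus are known, and it uses a coarse logarithmic grid search rather than a proper least-squares optimizer. All names and numbers are hypothetical.

```python
def anisotropy(protein_nM, kd_nM, r_free, r_bound):
    """One-site binding isotherm: r = r_free + (r_bound - r_free) * [P] / (Kd + [P])."""
    fraction_bound = protein_nM / (kd_nM + protein_nM)
    return r_free + (r_bound - r_free) * fraction_bound


def fit_kd(concentrations_nM, observed, r_free, r_bound):
    """Return the Kd on a log-spaced grid (0.01 to 10,000 nM) with the lowest squared error."""
    grid = [10 ** (k / 100) for k in range(-200, 401)]

    def squared_error(kd):
        return sum(
            (anisotropy(c, kd, r_free, r_bound) - r) ** 2
            for c, r in zip(concentrations_nM, observed)
        )

    return min(grid, key=squared_error)


# Synthetic, noise-free titration generated with Kd = 3 nM for demonstration only.
conc = [0.5, 1.0, 2.0, 5.0, 10.0, 50.0, 200.0]
data = [anisotropy(c, 3.0, 0.05, 0.20) for c in conc]
kd_estimate = fit_kd(conc, data, 0.05, 0.20)
```

With a nanomolar Kd the assumption that the peptide is far below Kd may not hold in practice, in which case a quadratic (ligand-depletion) binding equation would be the more appropriate model.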

Next, we investigated nucleosome remodelling by the ΔC-term and
ΔN-term ACF complexes. Notably, the ΔC-term mutation did not substantially
alter the dependence of remodelling kinetics on linker DNA
length (Extended Data Fig. 7d), whereas the linker-length sensitivity
was eliminated in the ΔN-term ACF complex (Fig. 4c, d). This finding is
consistent with the specific affinity of Acf1 N-term for the H4 tail (Fig. 4b).
Furthermore, if the loss of linker-length sensitivity was simply a result
of losing the linker DNA binding affinity of Acf1, ΔN-term ACF should
demonstrate inefficient remodelling for all linker DNA lengths. Instead,
ΔN-term ACF remodelled both short- and long-linker nucleosomes at
rates close to the rate with which WT ACF remodelled long-linker nucleosomes
(Fig. 4c, d), suggesting that deletion of Acf1 N-term disabled a
mechanism that inhibits remodelling at short linker lengths. ΔN-term
ACF also maintained the H4-tail requirement in remodelling (Fig. 4c).

Since Acf1 has affinity for both DNA and the H4 tail, a plausible interpretation
of the above observations is that the nucleosomal linker DNA
and H4 tail are in competition for binding to the N-terminal region of
Acf1 and that this competition is modulated by the length of the linker
DNA. Only when the linker is sufficiently short does Acf1 preferentially
bind to the H4 tail, making it unavailable to compete with the inhibitory
AutoN. Deletion of Acf1 N-term diminishes the Acf1-H4 tail interaction
such that the H4 tail is equally available to activate the ATPase at both
short and long linker DNA lengths. We therefore probed the linker-length
dependence of the Acf1-H4 tail proximity in ACF-bound nucleosomes
featuring a cysteine-reactive crosslinker on the H4 tail. A specific H4-Acf1
crosslinking product was clearly observed as a band with reduced electrophoretic
mobility compared with non-crosslinked Acf1 (Fig. 4e and
Extended Data Fig. 9). Remarkably, the Acf1-H4 crosslinking efficiency
decreased substantially with increasing linker DNA length (Fig. 4e), supporting
our hypothesis that the Acf1-H4 tail interaction is modulated
by the linker DNA length. In contrast, the H4-Snf2h crosslinking efficiency
did not change substantially with linker DNA length, probably
because Snf2h remains sufficiently close to the H4 tail regardless of the
linker DNA length, which allows crosslinking even when the H4 tail is
not specifically bound to its binding pocket on Snf2h.

Finally, we tested the physiological importance of the N-terminal
region of Acf1 by studying the role of its homologue in yeast. Yeast
ISW2 is functionally similar to ACF. It is composed of a catalytic subunit
(Isw2) that is homologous to Snf2h and three accessory subunits
(Itc1, Dpb4 and Dls1), among which Itc1 is homologous to Acf1. We
generated three mutant yeast strains: (1) deletion of the entire itc1 gene
(Δitc1), (2) deletion of only the portion of itc1 that encodes the N-terminal
region of Itc1 equivalent to Acf1 N-term (Δitc1-Nterm) and (3) a rescue
strain that was derived from the Δitc1-Nterm strain by deleting the remainder
of itc1 (rescue-Δitc1). Both Δitc1 and rescue-Δitc1 showed growth
rates similar to that of the WT strain (Fig. 4f), consistent with previous
observations. In contrast, the Δitc1-Nterm strain displayed dramatically
slower growth (Fig. 4f), consistent with an aberrant chromatin-misregulation
phenotype.

Taken together, our results suggest a nucleosome-spacing mechanism
for ACF in which the linker DNA length is sensed by the Acf1 accessory
subunit and allosterically transmitted to the Snf2h catalytic subunit through
the H4 tail of the nucleosome (Fig. 4g). Acf1 and the AutoN domain of
Snf2h function collectively in DNA linker-length sensing. When the
linker DNA is short, Acf1 preferentially binds to and sequesters the H4
tail, making it unavailable to compete its sequence homologue, AutoN,
off the ATPase. Hence, the ATPase activity is inhibited by AutoN. As
Figure 4 | The N-terminal region of the Acf1 accessory subunit is important
for linker DNA length sensing by the ACF complex. a, Domain architecture
of WT and ΔN-term (residues 1-371 deleted) Acf1. b, Fluorescence anisotropy
of dye-labelled H4-tail peptide in the presence of varying amounts of WT
(black symbols) or ΔN-term (green symbols) Acf1. Data are mean ± s.e.m.
(n = 3 independent experiments). The dissociation constant (Kd) for WT Acf1
is 3 ± 9 nM (error bars, 95% confidence intervals). c, Ensemble remodelling
time courses of [WT H4, 78 bp], [WT H4, 40 bp] and [H4Δ1-19, 78 bp]
nucleosomes by 40 nM WT ACF (black/grey lines, duplicated from Fig. 1b) and
ΔN-term ACF (green/light green symbols) at 5 µM ATP. d, Dependence of
the pause-phase exit rate on the linker DNA length for WT ACF (black) or
ΔN-term ACF (green). [ACF] = 10 nM and [ATP] = 20 µM. Data are
mean ± s.e.m. derived from at least 100 individual nucleosome remodelling
traces from three independent experiments. e, Crosslinking of the H4 tail to


the linker DNA length increases, Acf1 shifts its binding preference to
the linker DNA and releases the H4 tail, allowing it to compete AutoN
off the ATPase and activate ACF. This competition between the H4 tail
and linker DNA for Acf1 binding probably involves the N-terminal
region of Acf1. It is interesting to note that linker DNA sensing occurs 
during the pause phases when the ATPase domain is not actively trans- 
locating DNA, suggesting that AutoN engages the ATPase domain dur- 
ing the pauses. To exit the pauses, the H4 tail is required to relieve the 
inhibitory effect of AutoN. The re-engagement of AutoN with the ATPase 
domain after each translocation phase would give ACF an opportunity 
to periodically sense the linker DNA length. Such frequent sensing may 
allow a more efficient nucleosome spacing, as previously hypothesized. 
The linker DNA and the H4 tail are two important substrate features 
that regulate nucleosome remodelling by ISWI-family enzymes, the 
former enabling uniform nucleosome spacing for heterochromatin for- 
mation and the latter specifying regions of chromatin for silencing. Our 
results now reveal an unexpected convergence of the regulatory path- 
ways defined by these two distinct nucleosome features. 


METHODS SUMMARY 

Detailed descriptions of nucleosome and ACF preparation, as well as single-molecule 
and ensemble FRET, fluorescence anisotropy, protein crosslinking and yeast 
experiments, are provided in Methods. Briefly, various nucleosome constructs were 
reconstituted using Cy3-labelled histone octamers and Cy5-labelled DNA with a biotin 

Acf1 depends on the linker DNA length. The crosslinking products were 
analysed by SDS-PAGE (left). The Acf1-H4 crosslinking band was absent 
for ACF without nucleosomes (lane ‘ACF—nucleosomes’). Right: 
quantification of the H4-crosslinked fractions of Acf1 and Snf2h as a function 
of linker DNA length. Data are mean ± s.e.m. (n = 3 independent crosslinking 
experiments). f, Effects of deletion of Itc1 (Acf1 homologue) and its N-terminal 
region on the growth of yeast cells. Top row: WT. Second row: the itc1 gene 
is deleted (Δitc1). Third row: the coding sequence of the N-terminal region of 
Itc1 is deleted (Δitc1-Nterm). Bottom row: the remaining portion of itc1 is 
deleted from Δitc1-Nterm (rescue-Δitc1). One representative of three 
independent growth experiments is shown. g, Model for linker DNA length 
sensing by the ACF complex. DNA: grey lines; histone octamer: beige cylinders; 
Snf2h: blue/cyan; Acf1: green. The ATPase domain of Snf2h is depicted as 
a cyan sphere and labelled ‘On’ when active and ‘Off’ when inactive. 


moiety for surface anchoring. DNA was generated by PCR or by annealing and 
ligating a set of overlapping, complementary oligonucleotides (Extended Data Fig. 10). 
Histone octamer, nucleosomes, Acf1, Snf2h and ACF complexes were reconstituted 
and purified as described previously. Mutant yeast strains were generated in the 
BY4741 background. Single-molecule FRET measurements were performed with a 
custom-built microscope setup. Ensemble FRET used a Cary Eclipse fluorescence 
spectrophotometer. Fluorescence anisotropy measurements used a SpectraMax micro- 
plate reader. Crosslinking experiments used nucleosomes with a cysteine-reactive 
crosslinker, BM(PEG)3, at the H4 tail N terminus, and reaction products were 
analysed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE). 
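
As an illustration of how a dissociation constant can be related to a fluorescence anisotropy titration of the kind summarized above, the following minimal Python sketch evaluates a 1:1 binding model that accounts for depletion of the labelled peptide. It is not the analysis code used in this study. The function names, the peptide concentration and the free and bound anisotropy values are hypothetical, and the sketch assumes that anisotropy scales linearly with the bound fraction (that is, no change in fluorescence intensity on binding).

```python
import math


def fraction_bound(protein_nM, peptide_nM, kd_nM):
    """Fraction of labelled peptide bound in a 1:1 equilibrium.

    Solves Kd = (P - C)(L - C) / C for the complex concentration C,
    taking the physically meaningful (smaller) root of the quadratic.
    """
    total = protein_nM + peptide_nM + kd_nM
    complex_nM = (total - math.sqrt(total ** 2 - 4.0 * protein_nM * peptide_nM)) / 2.0
    return complex_nM / peptide_nM


def anisotropy(protein_nM, peptide_nM, kd_nM, r_free, r_bound):
    """Predicted anisotropy, assumed linear in the bound fraction."""
    fb = fraction_bound(protein_nM, peptide_nM, kd_nM)
    return r_free + (r_bound - r_free) * fb


if __name__ == "__main__":
    # Hypothetical titration: 10 nM peptide, Kd = 5 nM, r from 0.05 to 0.20.
    for protein in (0.0, 5.0, 20.0, 100.0, 1000.0):
        print(protein, round(anisotropy(protein, 10.0, 5.0, 0.05, 0.20), 4))
```

In practice the Kd would be obtained by fitting this curve to the measured anisotropy values with a least-squares routine; the sketch only shows the forward model.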


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


G-protein-coupled receptors (GPCRs) are critically regulated by 
β-arrestins, which not only desensitize G-protein signalling but also 
initiate a G-protein-independent wave of signalling. A recent surge 
of structural data on a number of GPCRs, including the β2 adrenergic 
receptor (β2AR)-G-protein complex, has provided novel insights into 
the structural basis of receptor activation. However, complementary 
information has been lacking on the recruitment of β-arrestins to 
activated GPCRs, primarily owing to challenges in obtaining stable 
receptor-β-arrestin complexes for structural studies. Here we devised 
a strategy for forming and purifying a functional human 
β2AR-β-arrestin-1 complex that allowed us to visualize its architecture by 
single-particle negative-stain electron microscopy and to characterize 
the interactions between β2AR and β-arrestin 1 using 
hydrogen-deuterium exchange mass spectrometry (HDX-MS) and chemical 
crosslinking. Electron microscopy two-dimensional averages and 
three-dimensional reconstructions reveal bimodal binding of β-arrestin 1 
to the β2AR, involving two separate sets of interactions, one with 
the phosphorylated carboxy terminus of the receptor and the other 
with its seven-transmembrane core. Areas of reduced HDX together 
with identification of crosslinked residues suggest engagement of the 
finger loop of β-arrestin 1 with the seven-transmembrane core of the 
receptor. In contrast, focal areas of raised HDX levels indicate 
regions of increased dynamics in both the N and C domains of 
β-arrestin 1 when coupled to the β2AR. A molecular model of the 
β2AR-β-arrestin signalling complex was made by docking activated 
β-arrestin 1 and β2AR crystal structures into the electron microscopy map 
densities with constraints provided by HDX-MS and crosslinking, 
allowing us to obtain valuable insights into the overall architecture of a 
receptor-arrestin complex. The dynamic and structural information 
presented here provides a framework for better understanding the 
basis of GPCR regulation by arrestins. 

To facilitate the isolation of a stable β2AR-β-arrestin complex, we 
used a modified β2AR construct with its C terminus replaced by that of 
the arginine vasopressin type 2 receptor (AVPR2). This chimaeric 
receptor (β2V2R) maintains pharmacological properties identical to the 
β2AR, but it binds β-arrestins with higher affinity compared to wild-type 
β2AR. We co-expressed β2V2R, β-arrestin 1 (1-393) and GRK2-CAAX 
(GRK2 with a membrane-tethering prenylation signal) in insect cells 
followed by agonist stimulation and affinity purification through the 
Flag-tagged receptor (Fig. 1a). However, since the isolation of a stable 
complex was still not feasible (Fig. 1b, lanes 1 and 2), we explored 
enhancing its stability by adding Fab30, an antibody fragment we previously 
reported that selectively recognizes and stabilizes the active 
conformation of β-arrestin 1 (ref. 13). Indeed, incubation of Fab30 with 
preformed complex in the membrane resulted in a robust purification of 
the β2V2R-β-arrestin-1 complex (Fig. 1b, lanes 5 and 6), whereas a 
non-specific Fab (referred to as Fab1) did not support complex stabilization 
(Fig. 1b, lanes 3 and 4). Complex isolation was only possible in 
response to an agonist (BI-167107) and not an inverse agonist 
(ICI-118551) (Fig. 1b, lanes 5 and 6). Furthermore, the efficiency of complex 
purification using this approach directly mirrors the pharmacological 
efficacy of the ligand used to stimulate the cells (Fig. 1c). While stimulation 
of cells with inverse agonists does not yield detectable co-purification 
of β-arrestin 1, agonists robustly stabilize the complex and partial 
agonists yield co-purification of β-arrestin 1 at moderate levels. Moreover, 
the efficiency of complex formation also corresponds to the ligand 
occupancy of the receptor as reflected by the increasing amount of β-arrestin 1 
co-purification with increasing agonist concentrations (Extended Data 
Fig. 1a, b). The direct correlation of ligand efficacy and occupancy with 
purification efficiency reflects the fact that this approach yields a 
complex that depends on both activated receptor conformation and 
receptor phosphorylation. The purified β2V2R-β-arrestin-1-Fab30 complex 
also exhibited a robust interaction with the purified clathrin terminal 
domain compared to β-arrestin 1 alone, suggesting that β-arrestin 1 in 
this complex is in a physiologically relevant and functional 
conformation (Extended Data Fig. 2). Importantly, this strategy allowed 
preparative-scale purification of a highly stable β2V2R-β-arrestin-1-Fab30 
complex as assessed by analytical size exclusion chromatography (Fig. 1a, 
bottom right, green trace, and Extended Data Fig. 1c). In addition to 
the Fab30-stabilized β2V2R-β-arrestin-1 complex, we were also able to 
obtain equally stable β2V2R-β-arrestin-1 complexes using the 
single-chain variable fragment of Fab30 (ScFv30) (Fig. 1a, bottom right, blue 
trace). 
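
The dependence of complex yield on ligand occupancy described above can be pictured with the standard single-site occupancy relation, occupancy = [L] / ([L] + Kd). The short Python sketch below is illustrative only: the function names, the Kd value and the concentrations are hypothetical and are not measurements from this study, and the sketch says nothing about efficacy, which differs between inverse, partial and full agonists.

```python
def fractional_occupancy(ligand_nM, kd_nM):
    """Fraction of receptor occupied at equilibrium for single-site binding."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand_nM must be >= 0 and kd_nM must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


def occupancy_series(concentrations_nM, kd_nM):
    """Occupancy at each concentration of a hypothetical agonist titration."""
    return [fractional_occupancy(c, kd_nM) for c in concentrations_nM]


if __name__ == "__main__":
    # Hypothetical agonist with Kd = 1 nM, titrated over four decades.
    for conc, occ in zip((0.1, 1.0, 10.0, 100.0),
                         occupancy_series((0.1, 1.0, 10.0, 100.0), 1.0)):
        print(f"{conc:6.1f} nM -> occupancy {occ:.3f}")
```

Under this simple picture, the amount of co-purified β-arrestin 1 would be expected to rise with agonist concentration and to plateau once the receptor is saturated.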

The interaction of β-arrestins with activated GPCRs is proposed to 
involve two sequential steps. First, the phosphorylated C terminus of 
activated GPCRs is thought to engage the N domain of β-arrestins, a 
high-affinity charge-charge interaction primarily mediated between the 
phosphates on the receptor tail and basic residues on β-arrestins. 
This first engagement is hypothesized to facilitate activating 
conformational changes in β-arrestin, leading in turn to additional interactions 
with the transmembrane core of the receptor. To obtain dynamic 
structural information on the receptor-β-arrestin complex, we carried out 
HDX-MS analysis on the purified assembly. In addition to the 
β2V2R-β-arrestin-1-Fab30 complex, we used the AVPR2 C-terminal 

Figure 1 | Formation and functional characterization of a stable 
agonist-β2V2R-β-arrestin-1 signalling complex. a, Schematic flowchart of a novel 
purification strategy to isolate β2V2R-β-arrestin-1-Fab30 complex and 
large-scale production and separation of agonist-β2V2R-β-arrestin-1-Fab30/ScFv30 
complex from the free receptor by size exclusion chromatography (Superdex 
200, 16/600 prep grade). The T4L domain is attached at the N terminus of the 
β2AR. βarr, β-arrestin. CH, constant domain of heavy chain; CL, constant 
domain of light chain; VH, variable domain of heavy chain; VL, variable domain 


phosphopeptide (V2Rpp)-β-arrestin-1-Fab30 complex as a reference 
to extract specific information about the core interaction between the 
receptor and β-arrestin 1. 

We observed a reduction in the HDX rate in the three major loops of 
β-arrestin 1, namely the finger loop (55%), the middle loop (16%) and the 
lariat loop (23%), when we compared the HDX-MS profile of the 
β2V2R-β-arrestin-1-Fab30 complex with that of the 
V2Rpp-β-arrestin-1-Fab30 complex (Fig. 2 and Extended Data Fig. 3a). Thus, these regions, 
and especially the finger loop, are likely to be buried (or have reduced 
solvent exposure) in the β2V2R-β-arrestin-1-Fab30 complex, probably 
through an intricate engagement with the transmembrane receptor core. 
This finding is consistent with previous electron paramagnetic 
resonance (EPR) studies on rhodopsin-arrestin interactions, which revealed 
a crucial involvement of the finger loop of arrestin with the core of 
rhodopsin. Interestingly, several regions in both the N and the C 
domains of β-arrestin 1, in contrast, reveal enhanced HDX rates, 
indicating that they become more dynamic upon interaction of β-arrestin 
with the agonist-bound phosphorylated receptor. This observation 
suggests that the core interaction between β-arrestin 1 and β2V2R 
probably has long-range effects on β-arrestin 1 structure. Previous 
studies mapping interactions between GPCRs and arrestins suggested that 
receptors may also interact with the broad concave surfaces of the N 
and C domains of arrestins. However, peptides representing these 
surfaces are not fully represented in our HDX-MS studies, thus limiting 
our ability to detect these interactions. We also note that our previously 
published high-affinity agonist radioligand binding data on the T4 
lysozyme (T4L)-β2V2R-β-arrestin-1-Fab30 complex in membranes, which 
provides a readout of the fully engaged β-arrestin conformation, 
suggested that approximately 32% of the receptor is in a high-affinity 
agonist binding state. This indicates that our HDX-MS data 
represent an average of two mixed complex populations, one with fully engaged 

of light chain. b, Isolation of β2V2R-β-arrestin-1 complex requires Fab30 
and is agonist dependent. Cells were stimulated either with inverse agonist 
(ICI-118551) or agonist (BI-167107) followed by incubation with or without 
Fab and subsequent purification on Flag M1 beads. CTL, control; IP, 
immunoprecipitation; WB, western blot. c, Formation of 
β2V2R-β-arrestin-1-Fab30 complex follows ligand efficacy. Formation of the complex in response 
to inverse agonists, partial agonists and full agonists is shown. The data are 
representative of seven independent experiments. Error bars, s.e.m. 


β-arrestin 1 with the receptor and the other displaying partially engaged 
β-arrestin 1. 
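
To make the idea of an ensemble average over two populations concrete, the Python sketch below computes the deuterium uptake that would be observed for a peptide when a fraction of the complexes is fully engaged and the remainder is partially engaged. This is a simplified illustration, not the analysis performed in this study: it assumes that the two populations do not interconvert during labelling, so that the measured centroid is the population-weighted mean. The function name and the uptake values are hypothetical; only the 32% fraction is taken from the text.

```python
def ensemble_uptake(uptake_full, uptake_partial, fraction_full):
    """Population-weighted mean deuterium uptake for a two-state mixture.

    uptake_full and uptake_partial are the uptake values (in Da) of the
    fully and partially engaged populations; fraction_full is the
    fraction of complexes in the fully engaged state (0 to 1).
    """
    if not 0.0 <= fraction_full <= 1.0:
        raise ValueError("fraction_full must lie between 0 and 1")
    return fraction_full * uptake_full + (1.0 - fraction_full) * uptake_partial


if __name__ == "__main__":
    # Hypothetical uptake values; 0.32 is the fully engaged fraction
    # suggested by the radioligand binding data.
    print(round(ensemble_uptake(2.0, 6.0, 0.32), 2))
```

A consequence of this averaging is that the protection measured for a peptide in the mixture underestimates the protection in the fully engaged population alone.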

Our previous crystal structure of V2Rpp bound to activated 
β-arrestin 1 revealed a marked repositioning of the finger loop compared 
to its position in inactive β-arrestin 1, presumably because it 
is primed to engage with the transmembrane core of the activated 
receptor (ref. 13). To test this we carried out MS-based mapping of the 
T4L-β2V2R-β-arrestin-1 interface using the homobifunctional, 
primary-amine-reactive chemical crosslinker disuccinimidyl adipate (DSA). We found 
that Lys 77 on β-arrestin 1 (towards the distal end of the finger loop) 
crosslinks with Lys 235 in the third intracellular loop of the β2AR 
(Extended Data Fig. 3b-e). These findings are in line with previously 
published biochemical and biophysical data suggesting an intricate interaction 
of the receptor core and the finger loop in arrestins. As an additional 
control for the close proximity of these residues, we created a series of 
mutants with single cysteine substitutions around Lys 235 in the 
N-terminal end of the third intracellular loop of the β2V2R (amino acids 
231-236) and in the finger loop around Lys 77 of β-arrestin 1 (amino 
acids 75-79) and evaluated the formation of disulphide-trapped 
complexes in pairs of receptor and β-arrestin-1 mutants. Consistent with 
our chemical crosslinking data, cysteines engineered at position 235 of 
the receptor and at position 78 in β-arrestin 1 yielded the most robust 
disulphide-trapped complex, suggesting a close proximity of these two 
residues in the complex (Extended Data Fig. 4). Taken together, these 
findings demonstrate a direct interaction of the finger loop with the 
receptor core. 

We next employed single-particle electron microscopy (EM) to 
examine the architecture and conformational dynamics of β2V2R-β-arrestin-1 
complexes. Owing to the asymmetric nature and small size of these 
complexes (~150 kilodalton (kDa) and ~125 kDa for the Fab and ScFv 
complexes, respectively), characterization attempts with cryo-EM were 

Figure 2 | HDX-MS analysis reveals potential interface between β2V2R and 
β-arrestin 1. a-c, Differential HDX rates of β-arrestin 1 in the 
β2V2R-β-arrestin-1-Fab30 versus V2Rpp-β-arrestin-1-Fab30 complexes were mapped 
onto the β-arrestin-1 crystal structure (Protein Data Bank (PDB) accession 
4JQI). Blue and red colour coding indicate the β-arrestin-1 regions that 
exchange slower and faster, respectively, in the β2V2R-β-arrestin-1-Fab30 
complex when compared to the V2Rpp-β-arrestin-1-Fab30 complex. Boxed 
regions with significant HDX rate changes are enlarged in a-c. The HDX rates 
of the finger loop (residues 63-75) (a), middle loop (residues 129-140) and 
lariat loop (residues 274-300) (b) became slower, whereas those of other 
regions, for example, β-strand I, II and X in the N domain (c) became 
faster in the β2V2R-β-arrestin-1-Fab30 complex when compared to the 
V2Rpp-β-arrestin-1-Fab30 complex. 


not successful and we thus applied negative-stain EM, which provides 
adequate contrast for alignment of small particle projections. This 
approach also enabled a direct comparison with our earlier 
negative-stain EM analysis of the β2AR-Gαs protein complex (ref. 9). As in that work, 
here we used a T4L fusion at the N terminus of the receptor (referred to 
as T4L-β2V2R) to provide a marker for the receptor orientation. The 
negative-stain EM visualization showed a monodisperse particle 
population (Fig. 3a and Extended Data Fig. 5) and we applied 
reference-free alignment and classification to obtain two-dimensional averages 
of the complex. 

The majority of averages of the β2V2R-β-arrestin-1-Fab30 complex 
revealed distinct projection profiles of an ovoid density, attributed to 
the receptor in partially flattened detergent micelle, with an attached 
T-like density attributed to the Fab30-β-arrestin-1 complex (Fig. 3b and 
Extended Data Fig. 6a). Comparisons with averages of the 
β2V2R-β-arrestin-1-ScFv30 complex identify the Fab30 density engaging the 
middle of β-arrestin 1, in agreement with our recent crystal structure 
of β-arrestin-1-Fab30 co-crystallized with the V2Rpp (Fig. 3b and 
Extended Data Fig. 6b). In this conformation β-arrestin 1 appears to hang 
off the receptor via a single point interaction presumably involving only 
the flexible V2Rpp fused on β2AR. The flexible nature of this interaction 
is further supported by the variable receptor orientation in these 
averages, as judged by the T4L domain positioning. It is possible that this 
‘hanging’ arrestin conformation based on the V2Rpp-β-arrestin-1 
interaction represents a transient intermediate step in the recruitment 
process that has been stabilized by Fab30. Strikingly, we also observe a 
substantial number of class averages, representing ~37% of particles, 
in which β-arrestin 1 forms a much more extensive interface with the 
receptor, employing roughly the opposite face of the Fab30 binding 
region (Fig. 3b, bottom). The observed fraction of particles displaying 
the extensive interface is in agreement with our previous radioligand 
binding results on the T4L-β2V2R-β-arrestin-1-Fab30 complex in 
membranes, which suggested that approximately 32% of the receptor is in a 
high-affinity agonist binding state. This observation also raised the 
possibility that β-arrestin 1 fully engages the receptor through a second 
set of weak interactions. 

To stabilize this weak interaction, we developed an approach whereby 
the β2V2R-β-arrestin-1-Fab30/ScFv30 complex is crosslinked by 
exposure to a glutaraldehyde-containing buffer zone while migrating through 
a size exclusion column (Extended Data Fig. 7a). This method facilitated 
near-complete crosslinking of preformed complexes at relatively high 
concentrations and simultaneously enabled the isolation of a highly 
monodisperse sample (Extended Data Figs 7b, c, 8, 9). 

EM classification and averaging of the crosslinked 
β2V2R-β-arrestin-1-Fab30/ScFv30 complexes revealed distinct views of a uniform particle 
architecture, suggesting that crosslinking stabilized a single complex 
conformer (Fig. 3c). More importantly, the averages show that arrestin 
interacts extensively with the receptor in a configuration that appears 
very similar to the one observed in the smaller fraction (~37%) of the 

Figure 3 | Single-particle EM analysis of the β2V2R-β-arrestin-1-Fab30/ 
ScFv30 complex. a, Representative raw EM image of negative-stained 
T4L-β2V2R-β-arrestin-1-Fab30/ScFv30 complexes. βarr, β-arrestin. Scale 
bar, 25 nm. b, Representative class averages of the native 
T4L-β2V2R-β-arrestin-1-Fab30/ScFv30 complex. Class averages of particles displaying the 
loose ‘hanging’ interaction (top) and the fully engaged ‘tight’ interaction 
(bottom) are presented. m, LMNG detergent micelle. Scale bar, 10 nm. 
c, Representative class averages of the ‘on-column’ crosslinked 
T4L-β2V2R-β-arrestin-1-Fab30/ScFv30 complex. Upon crosslinking, the majority of class 
averages display the tight (fully engaged) β-arrestin-1 conformation, similar to 
a fraction (~37%) of particles observed in the non-crosslinked complex. 

native complex. The conformational stabilizing action of the 
crosslinking is also evidenced by the consistent position of the T4L 
projection profile, in contrast to the variable positioning observed in averages 
of the native complex. To better characterize the β2V2R-β-arrestin-1 
assembly, we employed the random conical-tilt approach (ref. 26) to calculate 
low-resolution three-dimensional maps (~29 Å) from selected classes of the 
crosslinked complex (Extended Data Fig. 10). The three-dimensional 
reconstructions show distinct densities for the main complex components, 
in full agreement with our domain assignment in the two-dimensional 
projection averages (Fig. 4a and Extended Data Fig. 10). The 
receptor-containing region appears ovoid due to the large micelle ‘belt’ 
characteristic of the lauryl maltose neopentyl glycol (LMNG) detergent, as 
we also observed in the case of the β2AR-Gαs complex (ref. 9). A protrusion 
on one end of the receptor-micelle globular density represents the T4L 
domain that marks the receptor extracellular region. On the opposite 
side, the β-arrestin 1 density lies longitudinally on the receptor, 
engaging roughly the opposite side of the Fab30-interacting region. In this 
configuration, both β-arrestin domains appear to engage the receptor 
but one of the domains lies mostly outside the interacting zone. 

The HDX-MS, chemical crosslinking and disulphide trapping data 
allowed us to constrain the modelling of the T4L-β2AR and 
β-arrestin-1-Fab30 crystal structures within the density of the EM three-dimensional 
maps and generate a low-resolution model for the overall conformation 
of the β2AR-β-arrestin-1 complex (Fig. 4b). This model can 
accommodate limited rotations and translations of the individual crystal 
structures, which are also expected to undergo conformational changes 
upon complex formation. Lys 77 of β-arrestin 1 in our model is placed 
in close proximity to β2AR Lys 235, which is located at the end of a 
helical extension of transmembrane (TM)5 in the β2AR-Gαs complex (ref. 6). 
This prompted us to use this structure to model the β2AR-β-arrestin-1 
complex. In our model, β-arrestin 1 forms an extensive interface with 
the receptor through its N-terminal domain, which includes 
interactions with the phosphorylated receptor tail and the insertion of the 
finger loop directly in the receptor core, involving the space between 
TM3, 5 and 6. We note that the finger loop insertion is probably 
associated with outward shifts in the positioning of TM helices 3, 5 and 6 
and also helix 8. The middle and lariat loops of β-arrestin 1 do not 
participate in major interactions but reside close to the interface, as 
suggested by the modest reduction in their HDX rates observed by HDX-MS 
(Fig. 2c). The relative positioning of these loops is also in agreement with 
previous EPR studies on visual arrestin in complex with activated and 
phosphorylated rhodopsin. 

With regard to β2AR, TM5 and the third intracellular loop in this model 
are located above the concave β-sheet region of the N-terminal domain of 


Figure 4 | Structural model of the β2V2R-β-arrestin-1-Fab30 complex. 
a, Views of the T4L-β2V2R-β-arrestin-1-Fab30 complex three-dimensional 
reconstruction with modelled T4L-β2AR (green-orange; PDB accession 3SN6), 
β-arrestin-1 (blue; PDB accession 4JQI), and Fab30 (purple; PDB accession 
4JQI) crystal structures. The density surrounding β2V2R represents the 
LMNG detergent micelle and is marked by ‘m’. βarr, β-arrestin. Scale bar, 
5 nm. b, Views of the β2V2R-β-arrestin-1 interface within the dashed 
line square of a. Areas of β-arrestin 1 with reduced HDX are shown in cyan. 
Crosslinked Lys 235 of β2V2R and Lys 77 of β-arrestin 1 are highlighted. 
c, Illustration of the two-step GPCR-β-arrestin-1 interaction using surface 
representations of the structures of β2AR (orange), the phosphorylated 
C-terminal tail of V2R (yellow) and β-arrestin 1 (blue). The C-terminal 
portion of the V2R peptide (Glu 355-Asp 367) in the right model is 
positioned as found in the β-arrestin-1-Fab30-V2Rpp structure (PDB 
accession 4JQI), whereas the N-terminal portion (Ala 342-Pro 352) was 
remodelled to connect to the β2AR C terminus. 

β-arrestin 1. The placement of these receptor elements implies that the 
N terminus of V2Rpp cannot be in the position observed in the crystal 
structure of V2Rpp-β-arrestin-1-Fab30 (ref. 13), suggesting that the 
V2R C terminus in the β2V2R chimaeric receptor is mobile and 
repositions itself markedly upon β-arrestin-1 interaction with the receptor 
core. In contrast to the N-terminal domain, the C-terminal domain of 
β-arrestin 1 lies mostly outside the interaction zone, apart from the 
loop of residues 242-246 that is at interacting distance from the short 
α-helical segment connecting TM3 and TM4 of β2V2R. This observation 
is intriguing considering that mutations of the residues distal to the 
DRY motif (at the end of TM3) have been reported to directly affect 
β-arrestin recruitment for a number of GPCRs including the β2AR (ref. 27). 

Our results suggest that arrestin probably employs a biphasic mech- 
anism to engage the receptor (Fig. 4c). The first phase involves an in- 
teraction between the phosphorylated C-terminal tail of the receptor 
and the N-terminal domain of arrestin. Given the flexibility and the 
length of the C-terminal receptor tail, it is expected to act like a fishing 
line, sampling a wide interaction space at a high rate. The second point 
of interaction appears weak and involves primarily the insertion of the 
finger loop within the receptor core, resulting in a longitudinal arrange- 
ment of arrestin on the receptor (Fig. 4a, c). This arrangement would 
most certainly preclude GPCR engagement of G-protein heterotri- 
mers, thereby blocking classical GPCR signalling and inducing desens- 
itization. While it is not yet clear whether the single point interaction 
resulting in a hanging arrestin configuration has other physiological 
functions, it seems possible that these might involve recruitment and 
complex formation with components of the receptor endocytosis and 
signalling machinery such as clathrin and Gy. 


METHODS SUMMARY 


β2V2R, β-arrestin 1 and GRK2-CAAX were co-expressed in Sf9 cells. Sixty-six hours 
post-infection, cells were stimulated with the high-affinity agonist BI-167107 for 
30 min at 37 °C. Cells were harvested and lysed by douncing, followed by incubation 
with purified Fab30. One hour post-incubation, cells were solubilized and purified 
on a Flag M1 affinity column followed by size exclusion chromatography. The 
purified complex was subjected to HDX-MS analysis by incubating it with D2O for 
various time points followed by pepsin digestion and liquid chromatography (LC)/ 
MS-based identification of peptides. Purified T4L-β2V2R-β-arrestin-1-Fab30/ScFv30 
complex was embedded in negative stain and visualized by EM. EM two-dimensional 
averages of the complexes were obtained by ISAC (ref. 28) and three-dimensional 
reconstructions were obtained through the random conical-tilt method (ref. 26). 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 13 January; accepted 30 April 2014. 
Published online 22 June 2014. 


1. Pierce, K. L. & Lefkowitz, R. J. Classical and new roles of β-arrestins in the regulation 
of G-protein-coupled receptors. Nature Rev. Neurosci. 2, 727-733 (2001). 

2. Shukla, A. K., Xiao, K. & Lefkowitz, R. J. Emerging paradigms of β-arrestin-dependent 
seven transmembrane receptor signaling. Trends Biochem. Sci. 36, 
457-469 (2011). 

3. Lefkowitz, R. J. & Shenoy, S. K. Transduction of receptor signals by β-arrestins. 
Science 308, 512-517 (2005). 

4. Pierce, K. L., Premont, R. T. & Lefkowitz, R. J. Seven-transmembrane receptors. 
Nature Rev. Mol. Cell Biol. 3, 639-650 (2002). 

5. DeWire, S. M., Ahn, S., Lefkowitz, R. J. & Shenoy, S. K. β-Arrestins and cell signaling. 
Annu. Rev. Physiol. 69, 483-510 (2007). 

6. Rasmussen, S. G. et al. Crystal structure of the β2 adrenergic receptor-Gs protein 
complex. Nature 477, 549-555 (2011). 

7. Weis, W. I. & Kobilka, B. K. Structural insights into G-protein-coupled receptor 
activation. Curr. Opin. Struct. Biol. 18, 734-740 (2008). 

8. Rosenbaum, D. M., Rasmussen, S. G. & Kobilka, B. K. The structure and function of 
G-protein-coupled receptors. Nature 459, 356-363 (2009). 

9. Westfield, G. H. et al. Structural flexibility of the Gαs α-helical domain in the 
β2-adrenoceptor Gs complex. Proc. Natl Acad. Sci. USA 108, 16086-16091 (2011). 

10. Rasmussen, S. G. et al. Crystal structure of the human β2 adrenergic 
G-protein-coupled receptor. Nature 450, 383-387 (2007). 

11. Rasmussen, S. G. et al. Structure of a nanobody-stabilized active state of the β2 
adrenoceptor. Nature 469, 175-180 (2011). 


222 | NATURE | VOL 512 | 14 AUGUST 2014 


12. Oakley, R. H., Laporte, S. A., Holt, J. A., Caron, M. G. & Barak, L. S. Differential affinities 
of visual arrestin, β arrestin1, and β arrestin2 for G protein-coupled receptors 
delineate two major classes of receptors. J. Biol. Chem. 275, 17201-17210 (2000). 

13. Shukla, A. K. et al. Structure of active β-arrestin-1 bound to a G-protein-coupled 
receptor phosphopeptide. Nature 497, 137-141 (2013). 

14. Goodman, O. B. Jr et al. β-Arrestin acts as a clathrin adaptor in endocytosis of the 
β2-adrenergic receptor. Nature 383, 447-450 (1996). 

15. Nobles, K. N., Guan, Z., Xiao, K., Oas, T. G. & Lefkowitz, R. J. The active conformation 
of β-arrestin1: direct evidence for the phosphate sensor in the N-domain and 
conformational differences in the active states of β-arrestins1 and -2. J. Biol. Chem. 
282, 21370-21381 (2007). 

16. Xiao, K., Shenoy, S. K., Nobles, K. & Lefkowitz, R. J. Activation-dependent 
conformational changes in β-arrestin 2. J. Biol. Chem. 279, 55744-55753 (2004). 

17. Gurevich, V. V. & Gurevich, E. V. The molecular acrobatics of arrestin activation. 
Trends Pharmacol. Sci. 25, 105-111 (2004). 

18. Chung, K. Y. et al. Conformational changes in the G protein Gs induced by the β2 
adrenergic receptor. Nature 477, 611-615 (2011). 

19. Konermann, L., Pan, J. & Liu, Y. H. Hydrogen exchange mass spectrometry for 
studying protein structure and dynamics. Chem. Soc. Rev. 40, 1224-1234 (2011). 

20. Kim, M. et al. Conformation of receptor-bound visual arrestin. Proc. Natl Acad. Sci. 
USA 109, 18407-18412 (2012). 

21. Hanson, S. M. et al. Differential interaction of spin-labeled arrestin with inactive 
and active phosphorhodopsin. Proc. Natl Acad. Sci. USA 103, 4900-4905 
(2006). 

22. Zhuang, T. et al. Involvement of distinct arrestin-1 elements in binding to 
different functional forms of rhodopsin. Proc. Natl Acad. Sci. USA 110, 942-947 
(2013). 

23. Gimenez, L. E., Vishnivetskiy, S. A., Baameur, F. & Gurevich, V. V. Manipulation of 
very few receptor discriminator residues greatly enhances receptor specificity of 
non-visual arrestins. J. Biol. Chem. 287, 29495-29505 (2012). 

24. Gurevich, V. V. & Gurevich, E. V. Structural determinants of arrestin functions. Prog. 
Mol. Biol. Transl. Sci. 118, 57-92 (2013). 

25. Lohse, M. J. & Hoffmann, C. Arrestin interactions with G protein-coupled receptors. 
Handb. Exp. Pharmacol. 219, 15-56 (2014). 

26. Radermacher, M., Wagenknecht, T., Verschoor, A. & Frank, J. Three-dimensional 
reconstruction from a single-exposure, random conical tilt series applied to the 
50S ribosomal subunit of Escherichia coli. J. Microsc. 146, 113-136 (1987). 

27. Kim, K. M. & Caron, M. G. Complementary roles of the DRY motif and C-terminus 
tail of GPCRS for G protein coupling and β-arrestin interaction. Biochem. Biophys. 
Res. Commun. 366, 42-47 (2008). 

28. Yang, Z., Fang, J., Chittuluru, J., Asturias, F. J. & Penczek, P. A. Iterative stable 
alignment and clustering of 2D transmission electron microscope images. 
Structure 20, 237-247 (2012). 


Acknowledgements We thank D. Capel for technical assistance, V. Ronk, D. Addison 
and Q. Lennon for administrative support, R. K. Sunahara for stimulating discussions 
and Alex R. B. Thomsen for critical reading of the manuscript. We acknowledge support 
from the National Institutes of Health Grants DK090165 (G.S.), NS028471 (B.K.K.), 
GM072688 and GM087519 (A.A.K. and S.K.), HL075443 (K.X.), HL16037 and 
HL70631 (R.J.L.), from the Mathers Foundation (B.K.K.), GM60635 (P.A.P.) and from 
the Pew Scholars Program in Biomedical Sciences (G.S.). R.H. and S.S.S. were 
supported by a research grant from the Canadian Institutes of Health Research 
(MOP-93725). R.I.R. is supported by a postdoctoral fellowship from Coordenação de 
Aperfeiçoamento de Pessoal de Nível Superior. R.J.L. is an investigator with the Howard 
Hughes Medical Institute. 


Author Contributions A.K.S. designed and optimized procedures for forming and 
purifying the complex, executed and optimized the on-column crosslinking protocol 
and provided the preparations of complex used for EM, HDX-MS and crosslink 
mapping experiments with assistance from P.T.-S. R.I.R. and L-Y.H. performed 
biochemical and pharmacological characterization of the complex. G.H.W. performed 
EM analysis assisted by M.S., A.N.O. and A.M.D. and supervised by G.S. K.X. performed 
the HDX-MS experiments assisted by S.L., J.Q., A.W.K. and A.B., performed the crosslink 
mapping experiments assisted by J.Q. and A.W.K., and designed the disulphide 
trapping experiments carried out by M.C. V.L.W. Jr supervised the initial phase of the 
HDX-MS experiments. C.-R.L., L.-L.G., J.-M.S. and X.C. synthesized the high-affinity 
agonist BI-167107. R.H. and S.S.S. provided the linker sequence, vector and advice on 
ScFv conversion and expression. X.J.Y. and B.U.K. contributed in assessing various 
methods of complex formation. P.A.P. provided advice on implementation of ISAC²⁸. 
S.K. and A.A.K. provided the phage display library and protocols for Fab selection, 
expression and purification. B.K.K. conceived the on-column crosslinking strategy, 
advised A.K.S. on its execution and optimization, assisted with molecular modelling of 
the complex and participated in supervision of the project. G.S. directly supervised the 
EM studies, performed the molecular modelling of the complex and supervised overall 
project execution. R.J.L. supervised overall project design and execution. A.K.S., G.H.W., 
K.X., G.S., B.K.K. and R.J.L. participated in data analysis and interpretation. A.K.S., 
G.H.W., K.X., G.S., B.K.K. and R.J.L. wrote the manuscript. All authors have seen and 
commented on the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to R.J.L. 
(lefko001@receptor-biol.duke.edu), B.K.K. (kobilka@stanford.edu) or G.S. 
(skinioti@umich.edu). 






CAREERS 




MENTAL HEALTH 


Stressed students reach out for help 


Graduate students struggling with the stresses of their work 
and lives can tap into multiple avenues of support. 


BY JULIE GOULD 


Sarah Gossan got mostly ‘A’s during her 
undergraduate astrophysics programme 
at Cardiff University, UK, and graduated 
at the top of her class in 2012. In her third 
year, she started to study for the Graduate 
Record Examination (GRE), a standard test 
for admission to US graduate programmes, 
in the hope of starting a PhD in gravitational 
waves at the California Institute of Technology 
(Caltech) in Pasadena after she graduated. 
What her undergraduate peers and supervi- 
sors did not know was that she was struggling 
to deal with severe bulimia and depression. 
“The stress from research towards the end of 
my third year and the grad-school application 
process led to a relapse for about 8 months,” 
Gossan says. “I almost took a sick sense of 
pride in performing well academically while 
mentally ill.” 

Gossan did not reach out for help. “I was too 
embarrassed to ask for help for the bulimia, 
and it continued to get worse,” she says. Then, 
once she was accepted at Caltech, the transi- 
tion to the new environment caused her yet 
more stress. But she kept quiet because she 
did not want to appear weak, particularly as 
a woman in science. “I was afraid of being 
painted as ‘just another emotional woman’,” 
she says. The consequences were dismal: she 
failed the PhD qualifying exams twice in addi- 
tion to an exam on classical physics, went on 
strong medication and did not attend classes 
for almost five months. 

Gossan's experience is not unique. Maintain- 
ing mental health as a researcher-in-training 
can be a contradiction in terms. Many doctoral 
students are free to pursue a scientific 
field of their choice and, at least in theory, get 
an opportunity to become a leading researcher 
in that field. But the need to publish often, con- 
duct research independently, constantly apply 
for funding and meet the needs of supervisors 
can create substantial emotional and mental 
strain, anxiety and pressure. These hurdles can 
adversely affect a PhD student’s emotional well- 
being, especially if they are not expecting them 
— or do not know how to surmount them. 

There is a high risk of developing a major psychiatric 
illness, such as depression, schizophrenia, 
and bipolar or anxiety disorder, between 
the ages of 18 and 24 — just when students are 
pursuing degrees, says Victor Schwartz, who is 
medical director at the Jed Foundation, a chari- 
table organization in New York that aims to 
reduce suicide rates and improve mental health 
for university students. “This is a time of tran- 
sition from adolescence into adulthood, and 
often from undergraduate to graduate studies,” 
he explains. “Students experience many firsts, 
including new lifestyles, friends, roommates, 
cultures and ways of thinking.” Graduate stu- 
dents move off campus and become further 
removed from support networks, conduct more 
independent research and face uncertain career 
prospects, thanks to an unsteady regional and 
global job market, he says. Combine academic 
stresses with this transition, and it is not sur- 
prising that many doctoral students struggle to 
maintain mental health. 

Nearly one-fifth of the general US populace 
over the age of 18 — and 13% of master’s 
or PhD students — suffer from anxiety or 
depression (D. Eisenberg et al. Am. J. Orthopsy- 
chiatry 77, 534-542; 2007). Those in doctoral 
programmes are especially susceptible, says 
Catherine McAteer, head of student services at 
University College London (UCL). “PhD stu- 
dents tend to spend a lot of time by themselves 
doing their research in a lab or writing their 
theses, and isolation is often an instant pathway 
to depression and anxiety.” To combat the 
procrastination that goes hand-in-hand with 
isolation, UCL offers classes for PhD students 
to help them to focus attention on the present 
(see ‘Mind tricks’). 

In a 2013 US poll of 41,847 undergraduate 
and graduate students, almost one-third said 
that they had “felt so depressed that it was dif- 
ficult to function” in the past year. And nearly 
half said that their academic programme 
— their studies, research, lab colleagues and 
supervisors — had been “very difficult to han- 
dle” in the past year. 

Schwartz's advice to people who are strug- 
gling with depression, anxiety and other disor- 
ders is to reach out to others — whether they 
are friends, loved ones or counselling services. 
Schwartz and McAteer both advocate taking 


MIND TRICKS 
How mindfulness works 


Mindfulness is a therapeutic practice 
that helps to increase awareness of the 
present, which can improve thinking 
habits and mental health. Imagine, for 
instance, that your experiment has gone 
wrong: the data are not coming together 
and your deadline is tomorrow. Your 
usual response is to panic. Instead, you 
can: 

• Take three deep breaths. This 
stimulates the vagus nerve, which 
releases a chemical called acetylcholine 
that will calm you down. 
• Concentrate on the here and now in 
a non-judgemental way. Rather than 
blaming yourself, take a step back. 
By objectively acknowledging your 
frustrations, you will be able to see the 
problems more clearly and focus on how 
to solve them. 
• Keep your focus on a single object, 
idea or sensation rather than letting your 
mind wander off. 
• Stay aware of your body and its 
response to internal and external stimuli. 
• Reframe your emotions in a positive 
way. In the wake of negative thoughts or 
experiences, this can help you to react 
less emotionally and to be more resilient. 
• Try to be as objective as possible 
in terms of the way you think about 
yourself. J.G. 


Active Minds runs groups, like this one in Pennsylvania, that encourage students to discuss depression. 


advantage of on-campus mental-health ser- 
vices, which offer options such as group coun- 
selling, one-on-one sessions and peer support, 
in which students form a network that aims 
to promote mental health and provide confi- 
dential help. Campus peer groups are becom- 
ing more common: examples include Active 
Minds, which is based in Washington DC but 
has groups around the world, Peer Ears at the 
Massachusetts Institute of Technology in Cam- 
bridge and Cause for Concern at UCL. 

To reduce the likelihood of developing 
mental-health problems, doctoral research- 
ers should try to build a solid and trustworthy 
peer group in the early days of their pro- 
gramme, says Charlotte Vaughan, the disabil- 
ity adviser for mental health at UCL. This can 
be accomplished by joining discipline-based 
societies and clubs, or networks set up by the 
university mental-health services. “Most of 
all, we want to make sure that students are 
aware of the possible mental-health conditions 
they may face,” she says, “and know 
where they need to go if they think they’re 
running into trouble.” 

It was not until earlier this year that Gos- 
san started looking for serious help, after 
her partner asked her to do so. She began by 
speaking to Christian Ott, her supervisor at 
Caltech, who reassured her that “many, if not 
most people, cope with a whole range of men- 
tal issues”. Ott had dealt with his own problems 
in the past, and he values being open about the 
topic of mental health. “I made it clear that it 
is a common thing to run into such problems 
and that getting help and looking ahead is the 
important thing to do,” he says. 

Once she knew how Ott felt, Gossan spoke 
to him whenever she found she was falling 
back into old, unhealthy habits. “I work very 
actively to prevent a relapse,” says Ott. “This 
sometimes involves telling her specifically 
what to do or not to do.” 

If left untreated, mental-health problems 
can lead to suicide, says Charles Reynolds, a 
behavioural and community-health scientist 
at the Graduate School of Public Health of 
the University of Pittsburgh in Pennsylva- 
nia. A US survey in 2009 found that 4% of 
graduate students had “seriously considered 
attempting suicide” in the past 12 months 
(D. J. Drum et al. Prof. Psychol. Res. Pract. 
40, 213-222; 2009), and in 2011, the Ameri- 
can College Health Association reported 
that suicides were the leading cause of death 
in undergraduate and graduate students. 
“We need to remove this stigma attached to 
mental-health problems and find a way to get 
students to talk,” Reynolds says. 
Active Minds and Peer Ears are helpful 
in terms of intervention and treatment, say 
those involved in the networks. Talking 
discreetly with peer-support-group representatives 
can help to ease the fear that 
opening up about mental-health woes 
will have academic and other ramifications. 
Indeed, Gossan was warned by a fellow 
doctoral student against telling a supervisor 
about her depression. The colleague, who 
had suffered from mental-health disorders, 
told her that people would view her as unre- 
liable and would not want to work with her, 
illustrating that advice from untrained peers 
may not always be reliable. 

Ultimately, experts say, many mental-health 
issues, including depression, can be resolved 
only by talking to others — whether a counsel- 
lor, supervisor or peer representative. Gossan 
is grateful that Ott has been there for her. “He 
helped me through many anxiety attacks,” she 
says, “and without his support I think I probably 
would have dropped out by now.” 


THE DEATH OF IMMORTALITY 


BY KYLE L. WILSON & ANDREW B. BARBOUR 


“In all my years, I’ve never...” the doctor 
trembled while checking the readings. 
“Your son, he’s going to die.” 
Tears streamed down the mother’s 
face. She gripped her newborn tightly. 
“What does that mean, he’s going to 
die? How, how long does that take?” 
The doctor hesitated. “Maybe a cen- 
tury. The genetic implants did not take. 
Somehow, the testing failed.” Pulling 
up a millennium-old document in the 
Venter Laboratories database, he found 
the passage. “Continually shrinking 
telomeres. He will age and develop 
archaic tumours and, eventually, can- 
cer. His body will fail. We haven't seen 
anything like this since the Mortal Era.” 
Jaw clenched, she chewed on her 
words. “Do you know how many 
centuries it took to get approval for 
a child? And now what... he’s like... 
like one of our dogs!” 


Rosalind had followed the child since 
he first left his arcology six months 
ago. For much of Michael’s jour- 
ney, they were not alone. The world 
watched, captivated by the mortal’s deci- 
sion to leave his risk-free home. Rosalind, 
motivated by desire for advancement in a 
stagnant workforce, filmed his trek. 

“Michael grew a grey hair. He's so differ- 
ent! The fans will go crazy when they see. And 
our updated ‘road map’, as he terms it, shows 
that we will lose communication for weeks. 
He wants to climb some mountain in a place 
called the Himalayas. And after that, Europe!” 

Her boss was baffled. “You mean Europa? 
Surely he’d want to visit the moons...” 

“No, Europe. Sir, I don’t understand him.” 

“Who wants to go to that dreadful place? 
Since the floods it’s been deserted. This story 
is great Rosy, can you figure out the mindset 
of the last person that will ever die?” 

“He says it’s to connect with his roots, to 
experience history. But I think he’s insane. 
This is becoming too much for me. I need to 
be transferred.” 

“That’s impossible, Rosy. No one else will 
risk leaving the arcologies. We need your 
reports for our ratings. Keep this up and I'll 
promote you within a century.” 


Life lessons. 

Despite her years, the reporter maintained a 
youthful complexion with rose-red cheeks 
and blonde hair. Next to her, the grey-haired 
and sun-baked man. The stunning 400-year- 
old girl and ancient 90-year-old man 
traversed the sands of the old-world desert. 

He grinned like a mad Cheshire at the 
sight of the dunes. “The Arabian desert. 
Where Lawrence fought all those years ago.” 
She smiled warmly with youthful radiance, 
comforted by the decades-long bond the 
pair had formed. “I never knew such a place 
existed. Michael, it’s stunning!” She was rather 
hot and needed some water. To her left, off 
in the distance, she noticed a snake slithering 
away. Impossible a thing could survive here. 

They camped under the Milky Way’s glow. 
She worked up a fire, a skill learned from 
Michael years ago. Hundreds of years and I 
had never made a fire before... 

Suddenly, Michael started to cough. His 
once-strong legs shook as they strained to 
hold him upright. “Rose, I’m feeling weak 
now. The weakest since I left the hospital.” 
He sounded worried, despite his typical 
confidence. 

She was worried as well. She didn’t know 
how to act around a dying man, no one did. 
“Michael, why don’t we head back to the 
city —” 

A crackling voice interrupted. “Where are 
you?! No updates in three months! We need 
a new report, we can [...] 
composure while concerned for her dying 
friend. “It’s not important now, Sir. I’ve been 
recording our observations and I'll sub- 
mit them when we head back to the city.” 
Abruptly, she turned off her communicator. 

“Michael, let’s go back. It’s been 
walk walk walk for decades. You're 
fatigued.” The white lie made her sad. 
What else could she say? Only he knew 
how to feel — after all, he’d spent years 
reading antiquated books on religion 
and death. 


Back in the city hospital, Michael 
slipped away while Rosalind desper- 
ately gripped his hand. Really, the 
whole world held him. His death was 
broadcast live across the entire system, 
from Earth to Ganymede. Just like the 
doctor said: “Old age.” 

The audience watched his last 
breath. They didn’t know what to 
make of it, wondering where he 
would go. Distressed and feeling that 
lingering existential dread, the world 
switched channels. 


“Welcome to the Records and Appli- 
cation office, how may we serve you... 
Oh! Rosalind! I loved your work on Michael.” 

Rose exchanged pleasantries before ask- 
ing for the necessary forms. “I would like to 
apply for a child.” 

Held lightly in her hand, the pen danced 
across the tedious application. Upon review, 
she heard the administrator chuckle while 
flipping through her forms. 

“A girl huh? That's great, I'm sure she will be 
just as adventurous as her mother!” The man 
read on. “Oh, uh... I see you've filled out the 
liability waiver to decline genetic implants. 
Funny, that’s the fourth time I’ve seen that 
this week.” His pen darted a note. “Well, the 
application is in order, but I'm required by law 
to advise you against this choice. After all, you 
know better than anyone the severe disabil- 
ity your child will face.” Rosalind, warmly 
remembering her last days with Michael, 
nodded in acknowledgement. 